Here it is: the ggplot advent calendaR! The “gg” in ggplot refers to the grammar of graphics. For the next 25 days, we will go through an introduction to the grammar of graphics, make a lot of visualizations (some good, some bad), and learn some of the basic functions and features of the ggplot2 package. NOTE: I am no expert in ggplot. I literally learned while creating this tutorial, and that was a big motivation in doing this. If there are errors or smoother ways to do the same thing, please let me know! R is about constant learning and improving, and ggplot is no different.

DAY 1

On the first day of Christmas… we’re jumping into the tidyverse!

ggplot is part of the tidyverse, a group of packages that also includes dplyr, readr, and other very helpful packages that you should have! You can install and load ggplot separately, but… why? (:

The package we’re using is actually called ggplot2. Super-duper fun fact: “ggplot2 is called ggplot2 because once upon a time there was just a library ggplot. However, the developer noticed that it used an inefficient set of functions. In order for not to break the API, the authors introduced a successor package ggplot2. However, the central function in this package is still called ggplot(), not ggplot2()!” (wasn’t that fun? Source: Freeman & Ross, 2019).

Install and load tidyverse:

library(tidyverse)
Registered S3 methods overwritten by 'dbplyr':
  method         from
  print.tbl_lazy     
  print.tbl_sql      
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.4.1 
✔ readr   2.1.2      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ ggplot2::annotate() masks ctmm::annotate()
✖ dplyr::filter()     masks stats::filter()
✖ dplyr::lag()        masks stats::lag()

Let’s also load our data using another tidyverse package, readr, and then view the data. We will be working with two datasets throughout this tutorial: trees and sleighs.

Using readr and read_csv(), instead of read.csv() from base R, loads our data in as tibbles. Why use tibbles over traditional dataframes? Three reasons: (1) the input types aren’t automatically changed when you read in the data, (2) you can keep lists as columns, and (3) you can use non-standard variable names (e.g., starting with a number, as in “1st_place”). Thank you to my friend Pat for this explanation!
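Here’s a small sketch of reason (3). The CSV text is made up for illustration (it is not one of the tutorial’s datasets), and I() simply tells readr (version 2.0 or later) that the string is literal data rather than a file path.

```r
library(readr)

# Made-up CSV text for illustration only (not the tutorial's data)
csv_text <- "1st_place,height\nfir,2.1\npine,1.8\n"

base_df <- read.csv(text = csv_text)                      # base R data frame
tidy_tb <- read_csv(I(csv_text), show_col_types = FALSE)  # readr tibble

names(base_df)  # base R rewrites the first name so it is syntactically valid
names(tidy_tb)  # readr keeps the name exactly as written in the file
```

With a tibble you refer to a name like this using backticks, e.g., tidy_tb$`1st_place`.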

DAY 2

On the second day of Christmas…

…we learned the language of ggplot aka the Grammar of Graphics! ggplot is different from base R graphics. How? Base R graphics work on individual vectors, while ggplot works on dataframes (Source: Prabhakaran, 2017). ggplot works by adding layer upon layer to create your visualization. In base R graphics, you put all the information in one function call and it spits out a graphic. In ggplot, you build your graphic with layers.

What are some of the different components of a graphic (Source: Freeman & Ross, 2019)?

- data
- geometric objects (geoms)
- aesthetics
- statistical transformations
- position adjustments
- scale
- coordinate system
- facets
- themes

The first layer of ggplot is always…ggplot. If you run the code below, you will get a blank graphic with a grey background. The grey background is the ggplot default. In base R, you can’t run just the function, e.g., boxplot(), without a vector in the brackets.

Now let’s add another layer to ggplot. We’ll specify the dataset and what we want our x and y axes to be.
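A minimal sketch of these first two steps. trees_demo is a made-up stand-in for the trees data: I’m assuming columns called type, height, and xmas.magic, and the tree types and values are invented.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

# First layer only: a blank grey panel
blank <- ggplot()

# Dataset plus x and y axes, but no geoms yet, so no data are drawn
axes <- ggplot(trees_demo, aes(x = type, y = height))
```

Typing blank or axes at the console (or calling print() on them) draws the graphic.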

When you run this code, you should see that we now have axes and labels on top of our grey background. Note, you can drop the x= and y= and the code will run the same (don’t take my word for it, try it yourself!).

DAY 3

On the third day of Christmas…

…we added data to our graphic and explored geom layers!

Geoms, or geometric objects, are graphical representations of the data. There are many many types of geoms (here’s a long list of examples: https://ggplot2.tidyverse.org/reference/#geoms). Let’s try a few.

Geom layers start with “geom_” and are followed by the type of geometric object, e.g., “geom_point” or “geom_line”. Because we’re adding a new layer onto our graphic, we use a + after our first line of code. To keep things tidy and easy to read, I usually start my new layer on a new line.

Since we are working with categorical data (types of trees), the points all fall on three lines. Let’s change our x and y axes before moving on to a different type of geom.

Now we’re comparing two continuous variables (xmas magic and tree height) so the points are scattered across our graphic.
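The two scatterplots described above could be sketched like this, again with a made-up stand-in for the trees data (column names assumed, values invented):

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

# Categorical x: the points line up in three columns, one per tree type
p_type <- ggplot(trees_demo, aes(x = type, y = height)) +
  geom_point()

# Two continuous variables: the points scatter across the panel
p_magic <- ggplot(trees_demo, aes(x = xmas.magic, y = height)) +
  geom_point()
```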

Let’s look at some different geom types. We’ll go back to type vs. height for this one.

When creating graphics, always consider the type of data you’re working with (e.g., continuous vs. discrete). The type of data you’re working with should determine the type of geom you choose. Some geoms won’t run properly if the data you give them isn’t the type they expect.

One more example of a geom layer using the trees data.

DAY 4

On the fourth day of Christmas…

…we introduced pipes!

This is a bit of a detour from ggplot and you certainly don’t need to use pipes to create beautiful ggplot graphics BUT it might make your code look more tidy.

The pipe operator (%>%) is included in the tidyverse, so if you loaded the tidyverse, then you’ve got access to %>%. From R for Data Science, “Pipes are a powerful tool for clearly expressing a sequence of multiple operations…The point of the pipe is to help you write code in a way that is easier to read and understand.” In words, %>% means “and then”. By using a pipe, you’re telling R to run the first line of code AND THEN run the next line of code.

Why do I bring this up now? Previously, we ran this code to produce a boxplot showing tree heights:

Often, in online examples and tutorials, you will see ggplot code written with the dataframe listed first, then a pipe leading into the ggplot code. If you run the code below, you’ll see that we can produce the exact same graphic:
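Here’s a sketch of both versions side by side. trees_demo is a made-up stand-in for the trees data, and I load magrittr only to get %>% on its own (loading the tidyverse gives you the same pipe).

```r
library(ggplot2)
library(magrittr)  # provides %>%

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8)
)

# Dataframe given as the first argument of ggplot()
direct <- ggplot(trees_demo, aes(x = type, y = height)) +
  geom_boxplot()

# Dataframe first, "and then" piped into ggplot()
piped <- trees_demo %>%
  ggplot(aes(x = type, y = height)) +
  geom_boxplot()
```

Note that the pipe is only used to get the data into ggplot(). Layers are still added with +.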

Another tip from my friend Pat: “The most useful application of the pipe is to plug the result of one function into another without creating intermediate data frames.” Here’s an example of what this might look like:

my_data %>% function1(arguments, etc) %>% function2(arguments, etc)

DAY 5

On the fifth day of Christmas…

…we started working with the aesthetics of our graphic!

Aesthetic mapping describes the visual properties of the graphic. We can change the aesthetics of ggplot() or each layer we add to it. We use aes() and then customize our graphic to appear how we’d like it. On Day 3, we created some pretty boring graphs: white boxes or violins, black dots, and grey backgrounds. While this is fine for exploring data, it’s perhaps not how we want our finished product to look!

Today, we’re going to work on changing the colours of our data. Let’s start with the scatterplot, i.e., the geom_point() graphic.

To change the aesthetics of the geom_point() dots, we use “aes()”. You may have noticed that we already used “aes()” when we told ggplot what we wanted our X and Y axes to be. Now we will also tell ggplot to assign different colours to our points by tree type, by adding “colour = type” to the aes() inside ggplot(). In this case, we are not choosing the colours; ggplot picks them for us.

Now, instead of changing the aesthetics of the ggplot() layer, let’s change the aesthetics of the geom_point() layer by putting “aes()” within the parentheses of geom_point().

Looks the same right? So what’s the difference? Right now, nothing appears different, but later it may impact how your graphic looks. When you change the aesthetics of the ggplot() layer, those aesthetics will be applied as the default to all of your subsequent layers. If you change the aesthetics of an individual layer, it will only be applied to that layer and it will override the default.
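To make the difference visible, here’s a sketch that adds a second layer (geom_line(), which we haven’t covered yet) on top of the points. trees_demo is a made-up stand-in for the trees data.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

# colour set in ggplot(): the default for every layer, so points AND lines
# are coloured (and grouped) by tree type
p_global <- ggplot(trees_demo, aes(x = xmas.magic, y = height, colour = type)) +
  geom_point() +
  geom_line()

# colour set in geom_point() only: the points are coloured by type, but the
# line layer does not know about it and draws one line in a single colour
p_layer <- ggplot(trees_demo, aes(x = xmas.magic, y = height)) +
  geom_point(aes(colour = type)) +
  geom_line()
```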

What if we want to use all the same colour for our points? Find out tomorrow!

DAY 6

On the sixth day of Christmas…

…we continued with aesthetic mapping and colours!

So you want all your points to be one colour. Sounds easy-peasy right? Not quite.

Here’s why it’s a bit confusing. I had to do a bit of digging to understand exactly why this is…

When we want to map a variable of our data to an aesthetic (e.g., telling ggplot we want to colour by tree type), we put “colour =” INSIDE aes(). If we want to apply a constant colour (constant value) to our points (e.g., telling ggplot we want all our points to be blue), we put “colour =” OUTSIDE aes() but still inside geom_point(). Try both the codes below to see what happens.

Note: in this case you could also leave out the aes() within geom_point() and it would run the same.
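Here’s a sketch of the three cases, using a made-up stand-in for the trees data. The third plot is the classic mistake: inside aes(), "blue" is treated as a data value, not as a colour.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

base <- ggplot(trees_demo, aes(x = xmas.magic, y = height))

# Variable mapped to colour: goes INSIDE aes()
p_mapped <- base + geom_point(aes(colour = type))

# Constant colour: goes OUTSIDE aes(), but still inside geom_point()
p_constant <- base + geom_point(colour = "blue")

# Constant placed inside aes() by mistake: ggplot maps the text "blue" like a
# one-level variable, so the points get a default colour plus a legend
p_oops <- base + geom_point(aes(colour = "blue"))
```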

For more of an explanation on why aes() works this way, you can check out this helpful thread on stackoverflow that helped me: https://stackoverflow.com/questions/41863049/when-does-the-argument-go-inside-or-outside-aes. Here’s another really fantastic resource: https://drive.google.com/file/d/1Dvul1p6TYH6gWJzZRwpE0YX1dO0hDF-b/view.

DAY 7

On the seventh day of Christmas…

…we changed the colours of boxplots!

Try running the same code from yesterday but on a geom_boxplot() graphic.

Hmm… okay. But what if we wanted the inside of the boxes to be coloured, not the outline?

Instead of “colour =” we use “fill =”.

This works for points too. The default for geom_point() is solid points, but you can change these to points with different outline and fill colours by choosing one of the fillable shapes (shapes 21 to 25).

We can also change the colours of our boxplots by tree type, as we did with the points on Day 5. Remember from yesterday that because we want to change the colours by a variable (i.e., multiple colours determined by the levels of the variable) rather than change using a constant value (i.e., single colour), we put this INSIDE aes().
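Today’s variations, sketched with a made-up stand-in for the trees data (the colours are arbitrary choices):

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

boxes <- ggplot(trees_demo, aes(x = type, y = height))

p_outline <- boxes + geom_boxplot(colour = "blue")     # outline only
p_filled  <- boxes + geom_boxplot(fill = "blue")       # inside of the boxes
p_by_type <- boxes + geom_boxplot(aes(fill = type))    # one fill per tree type

# Points with separate outline and fill need a fillable shape (21 to 25)
p_points <- ggplot(trees_demo, aes(x = xmas.magic, y = height)) +
  geom_point(shape = 21, colour = "black", fill = "red", size = 3)
```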

Now that we’re playing around with colours, it’s time I introduce you to a good colour resource: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf. There are many great online resources about colours, but I like this one made by Dr. Ying Wei.

DAY 8

On the eighth day of Christmas…

…we finally took a break from talking about colours!

Colour is a big one, but let’s work with some other changes to the aesthetics of our graphic. Let’s bring back the geom_point() graphic. I’m using the theme of Christmas, but of course, December has many, many holidays. Hanukkah starts on Dec. 17 this year, so let’s make a Hanukkah-inspired graphic and look at different ways to change the aesthetics of our graphs.

Here we’ve specified the shape, the colour, the size, and the stroke (line thickness) of the points. There are many changes we can make to aesthetics; these are just a few examples. Let’s take a look at a line graph and how we can customize that.

We haven’t made a line graph yet, so let’s create a simple one first, before we customize it.

Here we’ve specified that we want each line to represent a different tree type by using “group=”. This goes inside aes() because it is being applied to the data (i.e., it can’t be done without this specific dataset).

Now let’s spruce it up. (“spruce” it up… because they’re trees… hahahaha. A little Christmas cheer for you).

What if we want to show the points AND the lines? Tomorrow we will add another layer to our graphic.

DAY 9

On the ninth day of Christmas…

… we added another geom_ layer to our graphic.

Here’s our geom_line() graphic from yesterday:

Now let’s add points to it as well. It’s as simple as + geom_point()!

And if you want to change the aesthetics of those points, where do you do it? Within geom_point()!

They kind of look like Christmas lights :)

DAY 10

On the tenth day of Christmas…

…we edited the text of our graphics.

Editing the look of your text and fonts in ggplot is easy, with lots of options to make it look exactly how you want it. Here’s our graphic from yesterday.

What kind of things might we want to change? Maybe we want a title and more informative and cleaner-looking axis labels. To do this, we need to add a new layer, labs(), for labels. You can also use xlab(), ylab(), and ggtitle() to add them individually.

What if we want to make more aesthetic changes to our fonts and labels? We will need to use themes!

DAY 11

On the eleventh day of Christmas…

…we introduced themes!

We can use themes to make finer adjustments to non-data parts of our graphic. While labs() is fine for adding labels and a title, themes allow us to choose the size, font, colour, position, etc. of that text. Today I’m going to introduce you to some complete themes.

Let’s go back to our nice, clean-looking boxplots:

While the grey background is the default in ggplot, it’s certainly not a requirement. That’s often one of the first things I change!

theme_bw() is a complete theme, meaning that it’s a theme that can be applied as a layer that changes the look of your overall plot. theme_grey() is the default. You can also try other complete themes, such as theme_dark(), theme_light(), theme_classic(), theme_void() and more.

Personally, I like this one best!

DAY 12

On the twelfth day of Christmas…

… we worked with theme() and more control with non-data parts of our graphics!

Here, I’m keeping theme_classic() and building on top of that.

In this case, I wanted to show you how you can keep the grid lines, while changing to theme_classic(). We’ve also changed the style and size of the title, moved the legend and changed the colours, changed the colour of the y axis label, and changed the size of the x axis tick labels. This is by no means a pretty graphic, but hopefully it gives you an idea of different ways we can change features of our graphic. There are many more changes you could make and lots of great online tutorials that cover this in more detail.
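A sketch of the kind of theme() adjustments described above, built on theme_classic(). The data are a made-up stand-in for the trees data, and the specific sizes and colours are my own arbitrary picks, not necessarily the ones used in the original plot.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8)
)

p_theme <- ggplot(trees_demo, aes(x = type, y = height, fill = type)) +
  geom_boxplot() +
  labs(title = "Christmas Trees", x = NULL, y = "Tree height") +
  theme_classic() +                                           # complete theme first
  theme(
    panel.grid.major = element_line(colour = "grey85"),       # bring grid lines back
    plot.title       = element_text(face = "bold", size = 16),
    legend.position  = "bottom",
    axis.title.y     = element_text(colour = "darkgreen"),
    axis.text.x      = element_text(size = 12)
  )
```

The order matters: theme() goes after theme_classic(), because a complete theme added later would wipe out the individual tweaks.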

DAY 13

On the thirteenth day of Christmas…

… we introduced scales!

We’ll be working with scales for a few days - get excited! Scales allow us to override defaults. Similar to themes, scales allow us more control over what our graphic looks like, but scales focus on changing the look of the data.

Let’s start with position scales. The most commonly used are scale_x_continuous() and scale_y_continuous(). Since we are working with categorical data right now (tree types), we could swap out scale_x_continuous() for scale_x_discrete(). Using these, we can set the limits of our scales. We don’t need to change the limits of our discrete (x) axis, but let’s change the limits of our y axis.

Here’s our boxplot graphic, but I’ve changed our y axis to Christmas magic instead of tree height:

trees %>%
ggplot(aes(x=type, y=xmas.magic))+
  geom_boxplot(aes(fill=type), colour="black")+
  labs(title="Christmas Trees", x=NULL, y="Christmas magic")+
  theme_classic()

Now let’s try changing the y-axis limits:

Note: you can also use lims(x=c(#,#), y=c(#,#)) (replacing the #s with your desired limits). This is simpler and faster, but we will stick with scale_y_continuous() because we will make additional adjustments below.

When we increased our y-axis limits, it changed the labels of our y-axis ticks to include 0.5s. Maybe we’d rather have whole numbers or maybe just fewer numbers altogether. We can specify this using the same scale_y_continuous() but adding “breaks=”.

If we don’t want any breaks we can specify by using “breaks=NULL”.

Finally, we can modify the axis tick labels. Let’s change the names of our x-axis tick labels.
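Putting limits, breaks, and tick labels together, here’s a sketch with a made-up stand-in for the trees data. The limits, breaks, and label text are arbitrary examples.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

p_scales <- ggplot(trees_demo, aes(x = type, y = xmas.magic)) +
  geom_boxplot() +
  # y axis from 0 to 10, with a tick at every second whole number
  scale_y_continuous(limits = c(0, 10), breaks = seq(0, 10, by = 2)) +
  # new tick labels, given in the same order as the levels on the axis
  scale_x_discrete(labels = c("Fir", "Pine", "Spruce"))
```

One thing to watch: limits set through a scale drop any data outside the range before the boxplot statistics are calculated, so choose limits that cover all of your data.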

There’s much much more that can be done with position scales. I suggest taking a look at Ch. 10 of this book by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen, which I relied on heavily for this section. https://ggplot2-book.org/scale-position.html#scale-position

DAY 14

On the fourteenth day of Christmas…

… we worked with colour scales!

We’ve been seeing the same three colours over and over: red, green, blue. But what if we want to specify the colours when we choose colour or fill by type? This is where colour scales come in.

We can choose different colour palettes by installing colour palette packages and loading them.

Here is a good resource, RColorBrewer.

install.packages("RColorBrewer")
library(RColorBrewer)

Let’s choose a palette from RColorBrewer. You can find the names and palettes here: https://r-graph-gallery.com/38-rcolorbrewers-palettes.html or you can run this code to display them in R:

Here we’ll create our boxplots again, so we will want to change the “fill” rather than the “colour”, therefore we use scale_fill_brewer(). We’re going to try the Dark2 palette:
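A sketch of the Dark2 boxplots, using a made-up stand-in for the trees data:

```r
library(ggplot2)
library(RColorBrewer)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8)
)

# The hex codes of the first three Dark2 colours
# (display.brewer.all() would draw every palette instead)
dark2 <- brewer.pal(3, "Dark2")

p_brewer <- ggplot(trees_demo, aes(x = type, y = height, fill = type)) +
  geom_boxplot() +
  scale_fill_brewer(palette = "Dark2")   # fill scale, because we mapped fill
```

If we had mapped colour instead of fill, the matching function would be scale_colour_brewer().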

What if you like the palette but don’t like how ggplot applied the colours? Or maybe you can’t find the perfect palette and want to create your own? We can manually assign colours too… but we’ll wait until tomorrow for that one!

DAY 15

On the fifteenth day of Christmas…

…we learned how to assign colours manually!

For this we use scale_color_manual() and/or scale_fill_manual(). For the boxplots, we will use scale_fill_manual().

The colours will be assigned in the order we gave them, so you can also repeat a colour (e.g., green, red, green) and it will be assigned in that order. Instead of colour names, you can also use the color codes (e.g., #E69F00).

We’ll be using these colours again and again, so why don’t we save them as a vector?

xmas <-c("darkgreen", "firebrick2", "mediumseagreen")

You can also specify the colours that will be assigned to each level of your variable:
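Both approaches, sketched with a made-up stand-in for the trees data. The tree type names (fir, pine, spruce) are invented, so swap in the levels of your own variable.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8)
)

xmas <- c("darkgreen", "firebrick2", "mediumseagreen")

boxes <- ggplot(trees_demo, aes(x = type, y = height, fill = type)) +
  geom_boxplot()

# Unnamed vector: colours are handed out in the order of the levels
p_in_order <- boxes + scale_fill_manual(values = xmas)

# Named vector: each level gets exactly the colour we name for it
p_by_name <- boxes +
  scale_fill_manual(values = c(spruce = "darkgreen",
                               fir    = "firebrick2",
                               pine   = "mediumseagreen"))
```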

Christmas colours aren’t necessarily colour-blind friendly. There are lots of fantastic resources when considering colour-blind friendly palettes for your graphics. Here’s one: https://colorbrewer2.org/#type=sequential&scheme=YlGnBu&n=9

DAY 16

On the sixteenth day of Christmas…

…we worked with a different type of plot and added multiple geoms to one plot!

We haven’t even touched the sleigh dataset! Let’s create a graphic with that so we can work with a different type of plot.

Here we’ve told ggplot to colour each sleigh type a different colour (some are nearly impossible to differentiate but we won’t worry about that today).

Maybe we want to add a trendline to our plot. We can do this by adding a geom_smooth() layer:

And maybe we want to customize that line. The default is a blue line with a grey band around it (the confidence interval). We can change that to a black line using “colour =” and we can change the transparency of the band using “alpha =”. We can also change the method of how the line is calculated using “method =”.
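A sketch of a customized trendline. sleighs_demo is a made-up stand-in for the sleighs data: I’m assuming columns called name, deerpower, and km_per_carrot, and all values (and most of the sleigh names) are invented. The linear model ("lm") is just one possible choice for “method =”.

```r
library(ggplot2)

# Made-up stand-in for the sleighs data (values are invented)
sleighs_demo <- data.frame(
  name = c("Winter Express", "Stealth Sleigh", "Snow Glider",
           "Jingle Cruiser", "Frost Runner", "North Star"),
  deerpower = c(8, 12, 6, 9, 10, 7),
  km_per_carrot = c(21, 15, 24, 19, 17, 22)
)

p_smooth <- ggplot(sleighs_demo, aes(x = deerpower, y = km_per_carrot)) +
  geom_point(aes(colour = name)) +
  geom_smooth(method = "lm",      # straight-line fit instead of the default
              formula = y ~ x,    # stated explicitly to avoid the message
              colour = "black",   # colour of the line
              alpha = 0.2)        # transparency of the band
```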

Alpha can also be helpful when you have overlapping points.

More detailed and additional information on colour scales can be found here: https://ggplot2-book.org/scale-colour.html

DAY 17

On the seventeenth day of Christmas…

…we edited our legend!

Want to move your legend? Change the shape, style, size? Get rid of it completely?

Let’s go back to our boxplots. We can remove the legend using “show.legend = FALSE”. We put this in the geom_boxplot() since the legend is linked to that layer.

Another option is to use guides(). In this case we use fill=“none” but if we were working with colour instead of fill, we would type colour=“none”. This way of removing the legend is handy in instances when you have multiple geoms. We can add one little line of code to remove the legend, instead of typing “show.legend=F” into each geom layer.

We can also change the text of the legend. When we changed the x-axis labels, the legend didn’t change with it. Let’s fix that now. We do this by adding the same “labels = c(…” in scale_fill_manual() as well as scale_x_discrete(). Why? Because scale_fill_manual() refers to the colours of your data, and the legend represents that (they are directly linked). scale_x_discrete() is focused solely on the x-axis.

REFRESHER: What if we want to change the position of the legend or the colour of the text? Remember back to when we talked about themes? If we want to make changes to anything that’s not related to the data (i.e., it could be a plot of anything or one without any data in it), we use THEMES.

DAY 18

On the eighteenth day of ChRistmas…

… we learned about guides!

Yesterday, we made some edits to our legend using scales and themes. Today, we will introduce one more way to exercise more fine control over your graphics: guides() and guide_ functions! Guides, like our legends and axes, help us or our audience interpret our plots. We can use guides() or the “guide =” argument of the scale_*() functions to make additional changes to our legends and axes. Here’s a great explanation of scales and guides from Ch.15 of Wickham, Navarro and Pedersen’s book, which I highly recommend you check out: https://ggplot2-book.org/index.html

“Formally, each scale is a function from a region in data space (the domain of the scale) to a region in aesthetic space (the range of the scale). The axis or legend is the inverse function, known as the guide: it allows you to convert visual properties back to data. You might find it surprising that axes and legends are the same type of thing, but while they look very different they have the same purpose: to allow you to read observations from the plot and map them back to their original values.”

Let’s look at some examples. Let’s try a new graphic with our data, and we’ll use a gradient colour scale to colour our points based on the amount of “Christmas magic” in our trees:

Before we get back to guides, let’s quickly chat about the gradient scale. There are many, many ways you can edit the colours, but in this case we told ggplot that we wanted to change the colour of our points with a gradient “scale_colour_continuous()” and then we set the high and low colours. We could have also set the middle colour or chosen an existing gradient. Learn more here: https://ggplot2-book.org/scale-colour.html

Back to guides! (colours are just so distracting!!)

We can make additional edits to our legends using “+ guides()” or by specifying the “guide =” argument within our scale layer (scale_colour_continuous(), which corresponds with our legend).

Here we’ve flipped our bar horizontally and increased the size of the legend. No changes have been made to the rest of our graphic. We can achieve the exact same output by adding “guide =” to our scale_colour_continuous() layer.

trees %>%
ggplot(aes(x=needle.drop, y=height))+
  geom_point(aes(colour=xmas.magic), size=2)+
  theme_classic()+
  scale_colour_continuous(low="red", high="mediumseagreen", guide = guide_colourbar(reverse=TRUE, direction = "horizontal", barheight=unit(2, "cm")))

Here are a couple more ways we can use guides to edit our legend. Let’s change up our plot a bit.

There’s a bit of overlap in our points, so let’s adjust the transparency using alpha:

But maybe we don’t want our legend to also have transparent points, so we can use a guide to override this aesthetic change.
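One way to do this is guide_legend() with its override.aes argument. Here’s a sketch with a made-up stand-in for the trees data (the alpha and size values are arbitrary):

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

p_legend <- ggplot(trees_demo, aes(x = xmas.magic, y = height, colour = type)) +
  geom_point(size = 3, alpha = 0.4) +   # see-through points in the plot
  # solid points in the legend only; the plotted points stay transparent
  guides(colour = guide_legend(override.aes = list(alpha = 1)))
```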

Finally, let’s use guides to change the aesthetics of our axes. We will go back to our sleighs dataset for this one.

sleighs %>%
ggplot(aes(x=name, y=deerpower))+
  geom_point()+
  labs(x="Sleighs")+
  theme_classic()

As you can see, the names of the sleighs are impossible to read. Let’s flip the labels at the bottom so that they run vertically instead.

sleighs %>%
ggplot(aes(x=name, y=deerpower))+
  geom_point()+
  theme_classic()+
  labs(x="Sleighs")+
  guides(x=guide_axis(angle=90))

Much better! Hopefully now you have an idea of some of the ways you can edit your guides (legends, axes) using guides() and guide =.

DAY 19

On the nineteenth day of Christmas…

… we made position adjustments!

Position adjustments are handy if you have overlapping geoms or data. You can override the default using the position argument in the geom_() functions.

Instead of boxplots, let’s look at the raw data points using a different type of geom layer, geom_jitter().

Position adjustments come in handy with point data like this, more so when we’re working with large datasets that have many points. Let’s adjust the position of our jittered points.
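A sketch of adjusting the jitter with position_jitter(), using a made-up stand-in for the trees data. The width and seed values are arbitrary picks.

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8)
)

p_jitter <- ggplot(trees_demo, aes(x = type, y = height)) +
  geom_jitter(position = position_jitter(
    width  = 0.1,   # small sideways wiggle, so the types stay clearly apart
    height = 0,     # no vertical wiggle, so the heights remain the true values
    seed   = 25     # makes the random wiggle the same every time you run it
  ))
```

Setting height = 0 is worth considering whenever the y axis holds real measurements, since jittering in that direction changes where the values appear to be.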

Let’s look at another example. We’ll bring back our scatterplot of sleigh data but I’m going to cut it down a bit to make it a bit easier to work with. I’ll do this using another tidyverse package, dplyr.

That legend is fine, but let’s get rid of it and instead label each point. Do you remember how to remove the legend?

sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name))+
  theme_classic()

Well this might work better, but the labels are all overlapping and difficult to read. This is where position_nudge() comes in handy!
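A sketch of nudging the labels upward so they sit above their points. sleighs_demo is a made-up stand-in for the sleighs subset (values and most names invented), and the nudge distance is an arbitrary pick in the units of the y axis.

```r
library(ggplot2)

# Made-up stand-in for the sleighs subset (values are invented)
sleighs_demo <- data.frame(
  name = c("Winter Express", "Stealth Sleigh", "Snow Glider",
           "Jingle Cruiser", "Frost Runner", "North Star"),
  deerpower = c(8, 12, 6, 9, 10, 7),
  km_per_carrot = c(21, 15, 24, 19, 17, 22)
)

p_nudge <- ggplot(sleighs_demo, aes(x = deerpower, y = km_per_carrot)) +
  geom_point(aes(fill = name), shape = 21, size = 3, show.legend = FALSE) +
  # shift every label 0.6 units up the y axis; the points do not move
  geom_text(aes(label = name), position = position_nudge(y = 0.6))
```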

And because our Stealth Sleigh is off the plot, let’s fix that using what we learned on Day 13 about limits.

DAY 20

On the twentieth day of Christmas…

… we did some position adjustments with bar plots!

First, let’s create a barplot since we haven’t done that yet. We’ll base it on our previous sleighs subset.

sleighs.subset %>%
  ggplot(aes(x=bells, y=reins))+
  geom_col()+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()
Warning: Continuous limits supplied to discrete scale.
Did you mean `limits = factor(...)` or `scale_*_continuous()`?

The sleighs come with 4, 6, or 8 bells. Note that geom_col() adds up the y values, so here the bar heights are the total number of reins across the sleighs in each category (# of bells); to display the counts of sleighs instead, we would use geom_bar(). But maybe we want to see some additional information in our barplot, such as the number of reins on the sleigh. I’ve heard that Santa takes these things super seriously, so this is a completely practical and reasonable plot. Note that we have to specify that reins is a categorical variable, not a continuous one, using as.character(). In this case, we can’t have 3.5 reins.

sleighs.subset %>%
  ggplot(aes(x=bells, y=reins, fill=as.character(reins)))+
  geom_col()+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()+
  scale_fill_manual(values=xmas)
Warning: Continuous limits supplied to discrete scale.
Did you mean `limits = factor(...)` or `scale_*_continuous()`?

The default is a stacked barplot (position = “stack”), but there are other ways we could display this using position adjustments. This option shows it as a percent using position = “fill”.

sleighs.subset %>%
  ggplot(aes(x=bells, y=reins, fill=as.character(reins)))+
  geom_col(position="fill")+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()+
  scale_fill_manual(values=xmas)
Warning: Continuous limits supplied to discrete scale.
Did you mean `limits = factor(...)` or `scale_*_continuous()`?

We can also position them side by side using position = “dodge”. Note that the red bar on the left and the green bar on the right are two bars side by side.

sleighs.subset %>%
  ggplot(aes(x=bells, y=reins, fill=as.character(reins)))+
  geom_col(position="dodge")+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()+
  scale_fill_manual(values=xmas)
Warning: Continuous limits supplied to discrete scale.
Did you mean `limits = factor(...)` or `scale_*_continuous()`?

Want a review? Try changing the name of your legend.

DAY 21

On the twenty-first day of Christmas…

…we learned about faceting!

Faceting produces smaller graphs that can be displayed alongside one another. We use facet_wrap() and facet_grid() for this.

Let’s start with facet_wrap(). Remember our line graph that looks a bit like Christmas lights? Let’s use that. Here it is, as a reminder:


Now, instead of having all three lines on one plot, let’s create three smaller plots and display them together.
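A sketch of the facet_wrap() version, using a made-up stand-in for the trees data:

```r
library(ggplot2)

# Made-up stand-in for the trees data (types and values are invented)
trees_demo <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.8, 2.1, 2.4, 2.0, 1.5, 1.9, 2.2, 2.6, 1.7, 2.3, 2.5, 2.8),
  xmas.magic = c(3, 5, 4, 6, 2, 4, 5, 7, 3, 6, 5, 8)
)

p_facets <- ggplot(trees_demo, aes(x = height, y = xmas.magic, group = type)) +
  geom_line(linetype = "dashed") +
  geom_point(size = 3, shape = 8) +
  facet_wrap(~type, ncol = 3)   # one small plot per tree type, in three columns
```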

Let’s do the same thing but using facet_grid(). The syntax is a little different, but we’ve produced the exact same set of plots. In our case, “.~type” puts the plots side by side.

trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
 geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_grid(.~type)+
  guides(colour="none")

If we want to stack our plots instead, we change up the coding within facet_grid(). In our case, “type~.” stacks the plots.

trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
 geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_grid(type~.)+
  guides(colour="none")

Our datasets aren’t really set up for this type of grid, but let’s look at plots of reins by bells to show you how you could set up facet_grid() with multiple plots. A plot area is produced with two levels for reins and three levels for bells.

You can also use “scales =” to adjust the scales of all or each of the plots. Let’s go back to our first set of plots from today:

trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_wrap(~type, ncol=3)+
  guides(colour="none")

To adjust the scales, we add “scales =” to facet_wrap().

trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_wrap(~type, ncol=3, scales = "free_y")+
  guides(colour="none")

By choosing “free_y” we have changed the y scale from fixed to free but the x-axis remained the same. Take a look at how the y-axis scale is now different for each plot. Other options are “free_x”, “fixed”, or “free”.

There are lots more things you can do with facets. Check out Ch. 17 of “ggplot2: Elegant Graphics for Data Analysis” for more: https://ggplot2-book.org/facet.html

Want a review? Try changing the style of the points and lines in these plots. Also, try removing the grey background!

DAY 22

On the twenty-second day of Christmas…

…we annotated our plots!

Sometimes we may want to add text to our plots, not just titles and labels, but annotations on the data or plot area. We can do this using geom_text(). We did this on Day 19 when we talked about position adjustments. Now we’re going to discuss annotations in more detail. Let’s bring back that graphic, but with some added text and labels.

Let’s say that the elf in charge wants to send this to Santa but wants to mark, on the plot, which sleigh is her top choice for Santa. We can do this using the annotate() function. We first need to set up our x & y ranges and our caption text.

yrng <- range(sleighs.subset$km_per_carrot)
xrng <- range(sleighs.subset$deerpower)
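# wrap the text at 40 characters, then join the pieces with line breaks
caption <- paste(strwrap("Sprinkle's top choice: Winter Express", width = 40), collapse = "\n")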

Then we can make our plot:
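A sketch of how the ranges and caption can feed into annotate(). sleighs_demo is a made-up stand-in for the sleighs subset (values and most names invented), and placing the caption in the top-right corner is my own choice.

```r
library(ggplot2)

# Made-up stand-in for the sleighs subset (values are invented)
sleighs_demo <- data.frame(
  name = c("Winter Express", "Stealth Sleigh", "Snow Glider",
           "Jingle Cruiser", "Frost Runner", "North Star"),
  deerpower = c(8, 12, 6, 9, 10, 7),
  km_per_carrot = c(21, 15, 24, 19, 17, 22)
)

yrng <- range(sleighs_demo$km_per_carrot)
xrng <- range(sleighs_demo$deerpower)
caption <- "Sprinkle's top choice: Winter Express"

p_note <- ggplot(sleighs_demo, aes(x = deerpower, y = km_per_carrot)) +
  geom_point() +
  # one piece of text, anchored at the largest x and y values in the data;
  # hjust = 1 and vjust = 1 make the text end at that corner
  annotate("text", x = xrng[2], y = yrng[2], label = caption,
           hjust = 1, vjust = 1, size = 4)
```

Unlike geom_text(), annotate() takes plain values rather than columns of the dataset, which is why it suits a single note like this.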

Or maybe we want to annotate the points directly to show Santa which ones Sprinkle chose as her top recommendations.

Hopefully it’s clear which point we’re referring to here (“Winter Express”) but in case it’s not, we can also highlight it by adding another geom_point() layer. Note that we add additional geom_point() layers but they must go before our original geom_point layer or the orange dots will appear on top of the coloured points (which in this case would also be fine!).

There’s much more you can do with annotations. As usual, I’ll direct you to ggplot2: Elegant Graphics for Data Analysis: https://ggplot2-book.org/annotations.html#direct-labelling

Want a review? Try changing the colour palette of the points in this plot.

DAY 23

On the twenty-third day of Christmas…

…we introduced cowplot!!

“Cowplot????????”

Yes. Cowplot.

Cowplot is an add-on to ggplot and allows us to combine several plots into one.

“How is that different from faceting??”

Cowplot allows you to combine plots of different types into one image!

First, let’s install and load the cowplot package.

install.packages("cowplot")
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/cowplot_1.1.1.tgz'
Content type 'application/x-gzip' length 1373845 bytes (1.3 MB)
==================================================
downloaded 1.3 MB

The downloaded binary packages are in
    /var/folders/xp/d3s384tn7f16chkdxc5n6jh80000gn/T//RtmpnNy2R1/downloaded_packages

library(cowplot)

Now let’s make a few graphs.

sleigh.plot1 <-sleighs.subset %>%
  ggplot(aes(x=km_per_carrot, y=bag_space))+
  geom_point(colour="darkgreen")+
  theme_classic()

sleigh.plot2 <-sleighs.subset %>%
  ggplot(aes(x=bells))+
  geom_bar(fill="firebrick2")+
  theme_classic()

sleigh.plot3 <-sleighs.subset %>%
  ggplot(aes(x=km_per_carrot, y=deerpower))+
  geom_quantile(colour="gold3")+
  theme_classic()

tree.plot1<-trees %>%
  ggplot(aes(x=type, y=height))+
  geom_boxplot(fill="seagreen")+
  theme_classic()

tree.plot2 <-trees %>%
  ggplot(aes(x=xmas.magic))+
  geom_dotplot(fill="tomato3")+
  theme_classic()

Now we can use cowplot to create one image with multiple plots.

plot_grid(tree.plot1, tree.plot2, nrow=1, labels = c("A", "B"))
Bin width defaults to 1/30 of the range of the data. Pick better value with `binwidth`.

Let’s combine all 5 plots into one image.

plot_grid(tree.plot1, tree.plot2, sleigh.plot1, sleigh.plot2, sleigh.plot3, nrow = 2)
Bin width defaults to 1/30 of the range of the data. Pick better value with `binwidth`.
Smoothing formula not specified. Using: y ~ x

This looks okay, but maybe we wanted the tree plots to be on the top and the sleigh plots to be on the bottom. We can do that by specifying which ones go in the top row and which on the bottom.
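
One way to do this (a sketch, and not the only option) is to build each row with its own plot_grid() call and then stack the two rows with a third plot_grid(). The five plots below are throwaway placeholders built from made-up data so the example runs on its own; in practice you would use tree.plot1, tree.plot2, and the three sleigh plots.

```r
library(ggplot2)
library(cowplot)

# Made-up data, only so the sketch is self-contained
toy <- data.frame(x = 1:10, y = c(2, 4, 3, 6, 5, 8, 7, 9, 12, 10))

# Placeholder plots standing in for the tree and sleigh plots
tree.a   <- ggplot(toy, aes(x, y)) + geom_point(colour = "seagreen") + theme_classic()
tree.b   <- ggplot(toy, aes(x, y)) + geom_line(colour = "tomato3") + theme_classic()
sleigh.a <- ggplot(toy, aes(x, y)) + geom_point(colour = "darkgreen") + theme_classic()
sleigh.b <- ggplot(toy, aes(x, y)) + geom_col(fill = "firebrick2") + theme_classic()
sleigh.c <- ggplot(toy, aes(x, y)) + geom_line(colour = "gold3") + theme_classic()

# Build each row separately...
top.row    <- plot_grid(tree.a, tree.b, nrow = 1, labels = c("A", "B"))
bottom.row <- plot_grid(sleigh.a, sleigh.b, sleigh.c, nrow = 1, labels = c("C", "D", "E"))

# ...then stack the rows
combined <- plot_grid(top.row, bottom.row, nrow = 2)
combined
```

Because each row is its own grid, the top row can hold two wide plots while the bottom row holds three narrower ones.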

Want a review? Try changing your axis labels to something cleaner and more informative (pretend you were going to publish this image in a paper!). Hint: you will need to edit the labels in your original plots, not in the plot_grid(). If you can’t remember how, check out day 10.

DAY 24

On the twenty-fourth day of Christmas…

…we put our knowledge to the test!

Today we will not learn anything new. Instead, we will bring together what we’ve learned to build some plots!

We will build three different plots:

  1. Build a plot using the trees dataset that shows violin plots by tree type, also coloured by tree type. Add a title, informative axis labels, and a legend at the bottom.

  2. Build a plot using the sleighs dataset that shows deerpower by weight, with a trendline, points coloured by km per carrot (with a size and shape where you can see the colour differences), and an informative title, axis labels, and legend.

  3. Build an image that shows the previous two graphs side-by-side in one image with labels A and B.

And as a little Christmas Eve gift, here is a fantastic cheatsheet for ggplot2. I keep it in my bookmarks bar :) https://github.com/rstudio/cheatsheets/blob/main/data-visualization-2.1.pdf - this might come in handy while you are working through these exercises!

DAY 25 - MERRY CHRISTMAS!!!

I hope you enjoyed this advent calendar. Similarly to the previous one, for day 25, I’ve given you code for a Christmas visual created by someone else (in this case, data scientist, Jodie Burchell). But unlike in the original R advent calendaR, now you should be able to understand a lot of the components and the grammar used in this code. Load the data and run the code to see what happens! The original blog post can be found here: https://t-redactyl.io/blog/2016/12/a-very-ggplot2-christmas.html

First, load in this dataset, which is available through GitHub.

ChristmasTree <- read.csv("https://raw.githubusercontent.com/t-redactyl/Blog-posts/master/Christmas%20tree%20base%20data.csv")
# Generate the "lights"
Desired.Lights <- 50
Total.Lights <- sum(round(Desired.Lights * 0.35) + round(Desired.Lights * 0.20) + 
                      round(Desired.Lights * 0.17) + round(Desired.Lights * 0.13) +
                      round(Desired.Lights * 0.10) + round(Desired.Lights * 0.05))

Lights <- data.frame(Lights.X = c(round(runif(round(Desired.Lights * 0.35), 4, 18), 0),
                                       round(runif(round(Desired.Lights * 0.20), 5, 17), 0),
                                       round(runif(round(Desired.Lights * 0.17), 6, 16), 0),
                                       round(runif(round(Desired.Lights * 0.13), 7, 15), 0),
                                       round(runif(round(Desired.Lights * 0.10), 8, 14), 0),
                                       round(runif(round(Desired.Lights * 0.05), 10, 12), 0)))
Lights$Lights.Y <- c(round(runif(round(Desired.Lights * 0.35), 4, 6), 0),
                          round(runif(round(Desired.Lights * 0.20), 7, 8), 0),
                          round(runif(round(Desired.Lights * 0.17), 9, 10), 0),
                          round(runif(round(Desired.Lights * 0.13), 11, 12), 0),
                          round(runif(round(Desired.Lights * 0.10), 13, 14), 0),
                          round(runif(round(Desired.Lights * 0.05), 15, 17), 0))
Lights$Lights.Colour <- c(round(runif(Total.Lights, 1, 4), 0))

# Generate the "baubles"
Baubles <- data.frame(Bauble.X = c(6, 9, 15, 17, 5, 13, 16, 7, 10, 14, 7, 9, 11, 
                                   14, 8, 14, 9, 12, 11, 12, 14, 11, 17, 10))
Baubles$Bauble.Y <- c(4, 5, 4, 4, 5, 5, 5, 6, 6, 6, 8, 8, 8, 8, 10,
                      10, 11, 11, 12, 13, 10, 16, 7, 14)
Baubles$Bauble.Colour <- factor(c(1, 2, 2, 3, 2, 3, 1, 3, 1, 1, 1, 2, 1, 2,
                                  3, 3, 2, 1, 3, 2, 1, 3, 3, 1))
Baubles$Bauble.Size <- c(1, 3, 1, 1, 2, 1, 2, 2, 2, 1, 1, 1, 3, 3, 3,
                         2, 3, 1, 1, 2, 2, 3, 3, 2)

# Generate the plot
ggplot() + 
  geom_tile(data = ChristmasTree, aes(x = Tree.X, y = Tree.Y, fill = Tree.Colour)) +
  scale_fill_identity() + 
  geom_point(data = Lights, aes(x = Lights.X, y = Lights.Y, alpha = Lights.Colour),
             colour = "lightgoldenrodyellow", shape = 16) +
  geom_point(data = Baubles, aes(x = Bauble.X, y = Bauble.Y, colour = Bauble.Colour, size = Bauble.Size),
             shape = 16) +
  scale_colour_manual(values = c("firebrick2", "gold", "dodgerblue3")) +
  scale_size_area(max_size = 12) +
  theme_bw() +
  scale_x_continuous(breaks = NULL) + 
  scale_y_continuous(breaks = NULL) +
  geom_segment(aes(x = 2.5, xend = 4.5, y = 1.5, yend = 1.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 5.5, xend = 8.5, y = 1.5, yend = 1.5), colour = "dodgerblue3", size = 2) +
  geom_segment(aes(x = 13.5, xend = 16.5, y = 1.5, yend = 1.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 17.5, xend = 19.5, y = 1.5, yend = 1.5), colour = "dodgerblue3", size = 2) +
  geom_segment(aes(x = 3.5, xend = 3.5, y = 0.5, yend = 2.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 7.0, xend = 7.0, y = 0.5, yend = 2.5), colour = "dodgerblue3", size = 2) +
  geom_segment(aes(x = 15.0, xend = 15.0, y = 0.5, yend = 2.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 18.5, xend = 18.5, y = 0.5, yend = 2.5), colour = "dodgerblue3", size = 2) +
  annotate("text", x = 11, y = 20, label = "Merry Christmas!",
           size = 12) +
  labs(x = "", y = "") +
  theme(legend.position = "none")

---
title: "The ggplot advent calendaR!"
output: html_notebook
---

Here it is: the ggplot advent calendaR! The "gg" in ggplot refers to the grammar of graphics. For the next 25 days, we will go through an introduction to the grammar of graphics, make a lot of visualizations (some good, some bad), and learn some of the basic functions and features of the ggplot2 package. **NOTE** I am no expert in ggplot. I literally learned while creating this tutorial, and that was a big motivation in doing this. If there are errors or smoother ways to do the same thing, please let me know! R is about constant learning and improving, ggplot is no different.

DAY 1

On the first day of Christmas... we're jumping into the tidyverse!

ggplot is part of the tidyverse, a group of packages that also includes dplyr, readr, and other very helpful packages that you should have! You can install and load ggplot separately, but… why? (:

The package we're using is actually called ggplot2. Super-duper fun fact: "ggplot2 is called ggplot2 because once upon a time there was just a library ggplot. However, the developer noticed that it used an inefficient set of functions. In order for not to break the API, the authors introduced a successor package ggplot2. However, the central function in this package is still called ggplot(), not ggplot2()!" (wasn't that fun? Source: Freeman & Ross, 2019).

Install and load tidyverse:
```{r}
install.packages("tidyverse")
library(tidyverse)
```

Let's also load our data using another tidyverse package, readr, and then view the data. We will be working with two datasets throughout this tutorial:
```{r}
sleighs <-read_csv("https://raw.githubusercontent.com/kiirsti/ggplot_adventcalendaR/main/sleigh.data.csv")
sleighs #if you highlight or retype and run the name "sleighs" it will show you a sample of the data

trees <-read_csv("https://raw.githubusercontent.com/kiirsti/ggplot_adventcalendaR/main/xmas.trees.csv")
trees
```
Using readr and read_csv(), instead of read.csv() from base R, loads our data in as tibbles. Why use tibbles over traditional dataframes? Three reasons: (1) the input types aren't automatically changed when you read in the data, (2) you can keep lists as columns, and (3) you can use non-standard variable names (e.g., starting with a number, as in "1st_place"). Thank you to my friend Pat for this explanation!
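
To make reasons (2) and (3) a bit more concrete, here's a tiny sketch using made-up values (the reindeer, elf, and gift names are just for illustration):

```{r}
library(tibble)

# data.frame() "repairs" non-standard names by default, so 1st_place becomes X1st_place
df <- data.frame(`1st_place` = "Dasher")
names(df)

# tibble() keeps the name exactly as written
tb <- tibble(`1st_place` = "Dasher")
names(tb)

# A tibble column can be a list, so each row can hold a different kind of object
gifts <- tibble(elf = c("Sprinkle", "Jingle"),
                items = list(c("train", "doll"), 1:3))
gifts
```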

DAY 2

On the second day of Christmas...

...we learned the language of ggplot aka the Grammar of Graphics! ggplot is different from base R graphics. How? Base R graphics work on individual vectors, whereas ggplot works on dataframes (Source: Prabhakaran, 2017). In base R graphics, you put all the information into one function call and it spits out a graphic. In ggplot, you build your graphic by adding layer upon layer.

What are some of the different components of a graphic (Source: Freeman & Ross, 2019)?
- data
- geometric objects (geoms)
- aesthetics
- statistical transformations
- position adjustments
- scale
- coordinate system
- facets
- themes

The first layer of ggplot is always...ggplot. If you run the code below, you will get a blank graphic with a grey background. The grey background is the ggplot default. In base R, you can't run a plotting function, e.g., boxplot(), on its own without a vector in the brackets.

```{r}
ggplot()
```
Now let's add another layer to ggplot. We'll specify the dataset and what we want our x and y axes to be.
```{r}
ggplot(trees, aes(x=type, y=height))
```
When you run this code, you should see that we now have axes and labels on top of our grey background. Note, you can drop the x= and y= and the code will run the same (don't take my word for it, try it yourself!).
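
Why does dropping x= and y= work? The first two arguments of aes() are x and y, so unnamed arguments are matched to them by position. A quick way to check this without drawing anything:

```{r}
library(ggplot2)

# Written out in full
named <- aes(x = type, y = height)

# The same mapping, relying on argument position
positional <- aes(type, height)

names(named)
names(positional)
```

Both print the same aesthetic names, so ggplot(trees, aes(type, height)) sets up the same axes as the longer version.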


DAY 3

On the third day of Christmas...

...we added data to our graphic and explored geom layers!

Geoms, or geometric objects, are graphical representations of the data. There are many many types of geoms (here's a long list of examples: https://ggplot2.tidyverse.org/reference/#geoms). Let's try a few.

Geom layers start with "geom_" and are followed by the type of geometric object, e.g., "geom_point" or "geom_line". Because we're adding a new layer onto our graphic, we use a + after our first line of code. To keep things tidy and easy to read, I usually start my new layer on a new line.

```{r}
ggplot(trees, aes(x=type, y=height))+
  geom_point()
```
Since we are working with categorical data (types of trees), the points all fall on three lines. Let's change our x and y axes before moving on to a different type of geom.

```{r}
ggplot(trees, aes(x=xmas.magic, y=height))+
  geom_point()
```
Now we're comparing two continuous variables (xmas magic and tree height) so the points are scattered across our graphic.

Let's look at some different geom types. We'll go back to type vs. height for this one.

```{r}
ggplot(trees, aes(x=type, y=height))+
  geom_boxplot()
```
When creating graphics, always consider the type of data you're working with (e.g., continuous vs. discrete). The type of data should determine the type of geom you choose. Some geoms won't run properly if the data you give them aren't of a type they can handle.

One more example of a geom layer using the trees data.

```{r}
ggplot(trees, aes(x=type, y=height))+
  geom_violin()
```

DAY 4

On the fourth day of Christmas...

...we introduced pipes!

This is a bit of a detour from ggplot and you certainly don't need to use pipes to create beautiful ggplot graphics BUT it might make your code look more tidy.

The pipe operator (%>%) is included in the tidyverse, so if you loaded the tidyverse, then you've got access to %>%. From R for Data Science, "Pipes are a powerful tool for clearly expressing a sequence of multiple operations...The point of the pipe is to help you write code in a way that is easier to read and understand." In words, %>% means "and then". By using a pipe, you're telling R to run the first line of code AND THEN run the next line of code.

Why do I bring this up now? Previously, we ran this code to produce a boxplot showing tree heights:

```{r}
ggplot(trees, aes(x=type, y=height))+
  geom_boxplot()
```

Often, in online examples and tutorials, you will see ggplot codes written with the dataframe listed first, then a pipe leading into the ggplot code. If you run the code below, you'll see that we can produce the exact same graphic:
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot()
```
Another tip from my friend Pat: "The most useful application of the pipe is to plug the result of one function into another without creating intermediate data frames." Here's an example of what this might look like:

my_data %>%
  function1(arguments, etc) %>%
  function2(arguments, etc)
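
For something you can actually run, here's a small sketch with dplyr (part of the tidyverse). The toy.trees data frame and its values are made up for this example; filter(), group_by(), and summarise() play the roles of function1 and function2 above.

```{r}
library(dplyr)

# Made-up tree data, just for this example
toy.trees <- data.frame(
  type = c("fir", "fir", "pine", "pine", "spruce"),
  height = c(2.0, 3.0, 1.5, 2.5, 4.0)
)

# Take the data AND THEN keep the taller trees AND THEN average height by type
tall.means <- toy.trees %>%
  filter(height > 1.8) %>%
  group_by(type) %>%
  summarise(mean_height = mean(height))

tall.means
```

No intermediate data frames (e.g., a separate "tall.trees" object) are needed along the way.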


DAY 5

On the fifth day of Christmas...

...we started working with the aesthetics of our graphic!

Aesthetic mapping describes the visual properties of the graphic. We can change the aesthetics of ggplot() or each layer we add to it. We use aes() and then customize our graphic to appear how we'd like it. On Day 3, we created some pretty boring graphs: white boxes or violins, black dots, and grey backgrounds. While this is fine for exploring data, it's perhaps not how we want our finished product to look!

Today, we're going to work on changing the colours of our data. Let's start with the scatterplot, i.e., the geom_point() graphic.

To change the aesthetics of the geom_point() dots, we add "aes()" within the parentheses of geom_point(). You may have noticed that we already used "aes()" when we told ggplot what we wanted our X and Y axes to be. Now we will also tell ggplot to assign different colours to our points by tree type. In this case, we are not choosing the colours.

```{r}
trees %>%
ggplot(aes(x=type, y=height, colour=type))+
  geom_point()
```
Now instead of changing the aesthetics of the ggplot() layer, let's instead change the aesthetics of the geom_point() layer.
```{r}
trees %>%
ggplot()+
  geom_point(aes(x=type, y=height, colour=type))
```
Looks the same right? So what's the difference? Right now, nothing appears different, but later it may impact how your graphic looks. When you change the aesthetics of the ggplot() layer, those aesthetics will be applied as the default to all of your subsequent layers. If you change the aesthetics of an individual layer, it will only be applied to that layer and it will override the default.
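
Here's a small sketch of a case where the difference does show up, using made-up data (toy.trees) and two layers. In the first plot, colour is mapped in ggplot(), so both the boxplots and the points inherit it. In the second, colour is mapped only in geom_point(), so the boxplots keep their default outline colour.

```{r}
library(ggplot2)

# Made-up tree data, just for this example
toy.trees <- data.frame(
  type = rep(c("fir", "pine"), each = 4),
  height = c(1.0, 2.0, 2.5, 3.0, 1.5, 2.5, 3.0, 3.5)
)

# colour set in ggplot(): the default for every layer
global.plot <- ggplot(toy.trees, aes(x = type, y = height, colour = type)) +
  geom_boxplot() +
  geom_point()

# colour set in geom_point() only: applies to the points, not the boxplots
layer.plot <- ggplot(toy.trees, aes(x = type, y = height)) +
  geom_boxplot() +
  geom_point(aes(colour = type))

global.plot
layer.plot
```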

What if we want to use all the same colour for our points? Find out tomorrow!

DAY 6

On the sixth day of Christmas...

...we continued with aesthetic mapping and colours!

So you want all your points to be one colour. Sounds easy-peasy right? Not quite.

Here's why it's a bit confusing. I had to do a bit of digging to understand exactly why this is...

When we want to map a variable of our data (e.g., telling ggplot we want to colour by tree type), we put "colour =" inside aes() within geom_point(). If we want to apply a constant colour (constant value) to our points (e.g., telling ggplot we want all our points to be blue), we put "colour =" OUTSIDE aes() but still inside geom_point(). Try both the codes below to see what happens.

```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_point(aes(colour="blue"))

trees %>%
ggplot(aes(x=type, y=height))+
  geom_point(aes(), colour="blue")
```
Note: in this case you could also leave out the aes() within geom_point() and it would run the same. 
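
If you're curious what ggplot actually did with "blue" in each case, layer_data() shows the values a layer ends up drawing with. This sketch uses a made-up data frame (toy.trees) so it runs on its own:

```{r}
library(ggplot2)

# Made-up tree data, just for this example
toy.trees <- data.frame(
  type = c("fir", "pine", "spruce"),
  height = c(2.0, 2.5, 4.0)
)

# Inside aes(): "blue" is treated as a data value (a variable with one level)
inside <- ggplot(toy.trees, aes(x = type, y = height)) +
  geom_point(aes(colour = "blue"))

# Outside aes(): "blue" is used as the actual colour
outside <- ggplot(toy.trees, aes(x = type, y = height)) +
  geom_point(colour = "blue")

unique(layer_data(inside)$colour)   # a colour from the default palette, not blue
unique(layer_data(outside)$colour)  # "blue"
```

That's also why the first version gets a legend with a single entry labelled "blue": ggplot thinks "blue" is a category in the data.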

For more of an explanation on why aes() works this way, you can check out this helpful thread on stackoverflow that helped me: https://stackoverflow.com/questions/41863049/when-does-the-argument-go-inside-or-outside-aes. Here's another really fantastic resource: https://drive.google.com/file/d/1Dvul1p6TYH6gWJzZRwpE0YX1dO0hDF-b/view.

DAY 7

On the seventh day of Christmas...

...we changed the colours of boxplots!

Try running the same code from yesterday but on a geom_boxplot() graphic.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(), colour="blue")
```
Hmm... okay. But what if we wanted the inside of the boxes to be coloured, not the outline?

Instead of "colour =" we use "fill =" instead.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(), fill="blue")
```

This works for points too. The default for geom_point() is solid points, but you can change these to points with different outline and fill colours.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_point(aes(), shape = 21, fill="green4", colour="red")
```
We can also change the colours of our boxplots by tree type, as we did with the points on Day 5. Remember from yesterday that because we want to change the colours by a variable (i.e., multiple colours determined by the levels of the variable) rather than change using a constant value (i.e., single colour), we put this INSIDE aes().

```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")
```

Now that we're playing around with colours, it's time I introduce you to a good colour resource: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf. There are many great online resources about colours, but I like this one made by Dr. Ying Wei.

DAY 8

On the eighth day of Christmas...

...we finally took a break from talking about colours!

Colour is a big one, but let's work with some other changes to the aesthetics of our graphic. Let's bring back the geom_point() graphic. I'm using the theme of Christmas, but of course, December has many, many holidays. Hanukkah starts on Dec. 17 this year, so let's make a Hanukkah-inspired graphic and look at different ways to change the aesthetics of our graphs.


```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_point(aes(), shape = 11, colour="blue", size = 6, stroke=2)
```

Here we've specified the shape, the colour, the size, and the stroke (line thickness) of the points. There are many changes we can make to aesthetics, these are just a few examples. Let's take a look at a line graph and how we can customize that.

We haven't made a line graph yet, so let's create a simple one first, before we customize it.

```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line()
```
Here we've specified that we want each line to represent a different tree type by using "group=". This goes inside aes() because it is being applied to the data (i.e., it can't be done without this specific dataset).

Now let's spruce it up. ("spruce" it up... because they're trees... hahahaha. A little Christmas cheer for you).
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")
```
What if we want to show the points AND the lines? Tomorrow we will add another layer to our graphic.

DAY 9

On the ninth day of Christmas...

... we added another geom_ layer to our graphic.

Here's our geom_line() graphic from yesterday:
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")
```

Now let's add points to it as well. It's as simple as + geom_point()!
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")+
  geom_point()
```

And if you want to change the aesthetics of those points, where do you do it? Within geom_point()!
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")+
  geom_point(aes(colour=type), size = 3, shape = 8)
```

They kind of look like Christmas lights :)

DAY 10

On the tenth day of Christmas...

...we edited the text of our graphics.

Editing the look of your text and fonts in ggplot is easy, with lots of options to make it look exactly how you want it. Here's our graphic from yesterday.
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")+
  geom_point(aes(colour=type), size = 3, shape = 8)
```
What kind of things might we want to change? Maybe we want a title and more informative and cleaner-looking axis labels. To do this, we need to add a new layer, labs() for labels. You can also use xlab(), ylab(), and ggtitle() to add them individually.

```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")+
  geom_point(aes(colour=type), size = 3, shape = 8)+
    labs(title="Christmas Trees", x="Tree height", y="Christmas magic")
```
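
For comparison, here's a sketch of the same labels added one at a time with ggtitle(), xlab(), and ylab(). It uses a small made-up data frame (toy.trees) so it runs on its own:

```{r}
library(ggplot2)

# Made-up tree data, just for this example
toy.trees <- data.frame(
  height = c(1, 2, 3, 4),
  xmas.magic = c(3, 5, 4, 8)
)

labelled <- ggplot(toy.trees, aes(x = height, y = xmas.magic)) +
  geom_point() +
  ggtitle("Christmas Trees") +   # same as labs(title = ...)
  xlab("Tree height") +          # same as labs(x = ...)
  ylab("Christmas magic")        # same as labs(y = ...)

labelled
```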
What if we want to make more aesthetic changes to our fonts and labels? We will need to use themes!

DAY 11

On the eleventh day of Christmas...

...we introduced themes!

We can use themes to make finer adjustments to non-data parts of our graphic. While labs() is fine for adding labels and a title, themes allow us to choose the size, font, colour, position, etc. of that text. Today I'm going to introduce you to some complete themes.

Let's go back to our nice, clean-looking boxplots:
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")
```
While the grey background is the default in ggplot, it's certainly not a requirement. That's often one of the first things I change!

```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_bw()
```
theme_bw() is a complete theme, meaning that it's a theme that can be applied as a layer that changes the look of your overall plot. theme_grey() is the default. You can also try other complete themes, such as theme_dark(), theme_light(), theme_classic(), theme_void() and more.

```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()
```
Personally, I like this one best!


DAY 12

On the twelfth day of Christmas...

... we worked with theme() and more control with non-data parts of our graphics!

Here, I'm keeping theme_classic() and building on top of that.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  labs(title="Christmas Trees", x=NULL, y="Tree height")+
  theme_classic()+
  theme(plot.title = element_text(face = "bold", size = 16, colour = "gold"),
    legend.background = element_rect(fill = "white", size = 4, colour = "red"),
    legend.justification = c(0, 1),
     axis.ticks = element_line(),
    axis.title.y = element_text(colour="darkgreen"),
    axis.text.x = element_text(size=12),
    panel.grid.major = element_line())
```
In this case, I wanted to show you how you can keep the grid lines, while changing to theme_classic(). We've also changed the style and size of the title, moved the legend and changed the colours, changed the colour of the y axis label, and changed the size of the x axis tick labels. This is by no means a pretty graphic, but hopefully it gives you an idea of different ways we can change features of our graphic. There are many more changes you could make and lots of great online tutorials that cover this in more detail.

DAY 13

On the thirteenth day of Christmas...

... we introduced scales!

We'll be working with scales for a few days - get excited! Scales allow us to override defaults. Similar to themes, scales allow us more control over what our graphic looks like, but scales focus on changing the look of the data.

Let's start with position scales. The most commonly used are scale_x_continuous() and scale_y_continuous(). Since we are working with categorical data right now (tree types), we could swap out scale_x_continuous() for scale_x_discrete(). Using these, we can set the limits of our scales. We don't need to change the limits of our discrete (x) axis, but let's change the limits of our y axis.

Here's our boxplot graphic, but I've changed our y axis to Christmas magic instead of tree height:
```{r}
trees %>%
ggplot(aes(x=type, y=xmas.magic))+
  geom_boxplot(aes(fill=type), colour="black")+
  labs(title="Christmas Trees", x=NULL, y="Christmas magic")+
  theme_classic()
```

Now let's try changing the y-axis limits:
```{r}
trees %>%
ggplot(aes(x=type, y=xmas.magic))+
  geom_boxplot(aes(fill=type), colour="black")+
  labs(title="Christmas Trees", x=NULL, y="Christmas magic")+
  theme_classic()+
  scale_y_continuous(limits=c(0,15))
```
Note: you can also use lims(x=c(#,#), y=c(#,#)) (replacing the #s with your desired limits). This is simpler and faster, but we will stick with scale_y_continuous() because we will make additional adjustments below.
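
Here's what the lims() shortcut could look like in practice. This is a sketch with made-up data (toy.trees), setting only the y-axis limits:

```{r}
library(ggplot2)

# Made-up tree data, just for this example
toy.trees <- data.frame(
  type = rep(c("fir", "pine"), each = 4),
  xmas.magic = c(3, 5, 6, 8, 4, 7, 9, 11)
)

lims.plot <- ggplot(toy.trees, aes(x = type, y = xmas.magic)) +
  geom_boxplot() +
  theme_classic() +
  lims(y = c(0, 15))   # shorthand for scale_y_continuous(limits = c(0, 15))

lims.plot
```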

When we increased our y-axis limits, the labels of our y-axis ticks changed to include 0.5s. Maybe we'd rather have whole numbers or maybe just fewer numbers altogether. We can specify this using the same scale_y_continuous() but adding "breaks=".

```{r}
trees %>%
ggplot(aes(x=type, y=xmas.magic))+
  geom_boxplot(aes(fill=type), colour="black")+
  labs(title="Christmas Trees", x=NULL, y="Christmas magic")+
  theme_classic()+
  scale_y_continuous(limits=c(2,12), breaks=c(2, 7, 12))
```
If we don't want any breaks we can specify by using "breaks=NULL".

Finally, we can modify the axis tick labels. Let's change the names of our x-axis tick labels.
```{r}
trees %>%
ggplot(aes(x=type, y=xmas.magic))+
  geom_boxplot(aes(fill=type), colour="black")+
  labs(title="Christmas Trees", x=NULL, y="Christmas magic")+
  theme_classic()+
  scale_y_continuous(limits=c(2,12), breaks=c(2, 7, 12))+
  scale_x_discrete(labels=c("Balsam Fir", "Jack Pine", "Blue Spruce"))
```
There's much much more that can be done with position scales. I suggest taking a look at Ch. 10 of this book by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen, which I relied on heavily for this section. https://ggplot2-book.org/scale-position.html#scale-position

DAY 14

On the fourteenth day of Christmas...

... we worked with colour scales!

We've been seeing the same three colours over and over: red, green, blue. But what if we want to specify the colours when we choose colour or fill by type? This is where colour scales come in.

We can choose different colour palettes by installing colour palette packages and loading them.

Here is a good resource, RColorBrewer.
```{r}
install.packages("RColorBrewer")
library(RColorBrewer)
```

Let's choose a palette from RColorBrewer. You can find the names and palettes here: https://r-graph-gallery.com/38-rcolorbrewers-palettes.html or you can run this code to display them in R:
```{r}
RColorBrewer::display.brewer.all()
```
Here we'll create our boxplots again, so we will want to change the "fill" rather than the "colour", therefore we use scale_fill_brewer(). We're going to try the Dark2 palette:
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()+
  scale_fill_brewer(palette="Dark2")
```

What if you like the palette but don't like how ggplot applied the colours? Or maybe you can't find the perfect palette and want to create your own? We can manually assign colours too... but we'll wait until tomorrow for that one!

DAY 15

On the fifteenth day of Christmas...

...we learned how to assign colours manually!

For this we use scale_color_manual() and/or scale_fill_manual(). For the boxplots, we will use scale_fill_manual().
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()+
  scale_fill_manual(values=c("darkgreen", "firebrick2", "mediumseagreen"))
```
The colours will be assigned in the order we gave them, so you can also repeat a colour (e.g., green, red, green) and it will be assigned in that order. Instead of colour names, you can also use hex colour codes (e.g., #E69F00).

We'll be using these colours again and again, so why don't we save them as a vector?
```{r}
xmas <-c("darkgreen", "firebrick2", "mediumseagreen")
```


Now we can pass that vector to scale_fill_manual() instead of typing out the colours each time:
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()+
  scale_fill_manual(values=xmas)
```
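
If you'd rather pin each colour to a specific level instead of relying on the order, scale_fill_manual() also accepts a named vector. This is a sketch with made-up data: the level names (fir, pine, spruce) are invented for the example, so swap in whatever levels your own type column has.

```{r}
library(ggplot2)

# Made-up tree data, just for this example
toy.trees <- data.frame(
  type = rep(c("fir", "pine", "spruce"), each = 4),
  height = c(1.0, 2.0, 2.5, 3.0, 1.5, 2.5, 3.0, 3.5, 2.0, 3.0, 4.0, 4.5)
)

# Names are the levels, values are the colours; the order of the vector no longer matters
xmas.named <- c(spruce = "mediumseagreen", fir = "darkgreen", pine = "firebrick2")

named.plot <- ggplot(toy.trees, aes(x = type, y = height)) +
  geom_boxplot(aes(fill = type), colour = "black") +
  theme_classic() +
  scale_fill_manual(values = xmas.named)

named.plot
```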

Christmas colours aren't necessarily colour-blind friendly. There are lots of fantastic resources when considering colour-blind friendly palettes for your graphics. Here's one: https://colorbrewer2.org/#type=sequential&scheme=YlGnBu&n=9

DAY 16

On the sixteenth day of Christmas...

...we worked with a different type of plot and added multiple geoms to one plot!

We haven't even touched the sleigh dataset! Let's create a graphic with that so we can work with a different type of plot.
```{r}
sleighs %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3)+
  theme_classic()
```
Here we've told ggplot to colour each sleigh type a different colour (some are nearly impossible to differentiate but we won't worry about that today).

Maybe we want to add a trendline to our plot. We can do this by adding a geom_smooth() layer:
```{r}
sleighs %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3)+
  geom_smooth()+
  theme_classic()
```
And maybe we want to customize that line. The default is a blue line with a grey band around it showing the confidence interval. We can change that to a black line using "colour =" and we can change the transparency of the band using "alpha =". We can also change how the line is calculated using "method =" (here, lm fits a straight line).
```{r}
sleighs %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3)+
  geom_smooth(colour="black", alpha=0.2, method=lm)+
  theme_classic()
```
Alpha can also be helpful when you have overlapping points.
```{r}
sleighs %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, alpha=0.5)+
  geom_smooth(colour="black", alpha=0.2)+
  theme_classic()
```
More detailed and additional information on colour scales can be found here: https://ggplot2-book.org/scale-colour.html


DAY 17

On the seventeenth day of Christmas...

...we edited our legend!

Want to move your legend? Change the shape, style, size? Get rid of it completely?

Let's go back to our boxplots. We can remove the legend using "show.legend = FALSE". We put this in the geom_boxplot() since the legend is linked to that layer.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black", show.legend = FALSE)+
  theme_classic()+
  scale_fill_manual(values=xmas)
```
Another option is to use guides(). In this case we use fill="none" but if we were working with colour instead of fill, we would type colour="none". This way of removing the legend is handy in instances when you have multiple geoms. We can add one little line of code to remove the legend, instead of typing "show.legend=F" into each geom layer.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()+
  scale_fill_manual(values=xmas)+
  guides(fill="none")
```

We can also change the text of the legend. When we changed the x-axis labels, the legend didn't change with it. Let's fix that now. We do this by adding the same "labels = c(..." in scale_fill_manual() as well as scale_x_discrete(). Why? Because scale_fill_manual() refers to the colours of your data, and the legend represents that (they are directly linked). scale_x_discrete() is focused solely on the x-axis.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()+
  scale_fill_manual(name="Tree Type", values=xmas, labels=c("Balsam Fir", "Jack Pine", "Blue Spruce"))+
  scale_x_discrete(labels=c("Balsam Fir", "Jack Pine", "Blue Spruce"))
```
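Typing the same labels twice is easy to get wrong, so here is one way to avoid it: store the labels once in a named vector and hand it to both scales. This is only a sketch. The data frame toy_trees, the type codes ("balsam", "jack", "spruce") and the three colours are all assumptions I made up for the example.

```{r}
library(ggplot2)

# Hypothetical stand-in for the trees data (values and type codes are made up)
toy_trees <- data.frame(
  type   = rep(c("balsam", "jack", "spruce"), each = 3),
  height = c(2.1, 2.4, 2.2, 1.8, 1.9, 2.0, 2.8, 3.0, 2.9)
)

# Names are the values found in the data, elements are the labels to display
tree_labels <- c(balsam = "Balsam Fir", jack = "Jack Pine", spruce = "Blue Spruce")

p_labels <- ggplot(toy_trees, aes(x = type, y = height)) +
  geom_boxplot(aes(fill = type), colour = "black") +
  theme_classic() +
  # The same named vector relabels the legend...
  scale_fill_manual(name = "Tree Type",
                    values = c("firebrick2", "seagreen", "gold3"),
                    labels = tree_labels) +
  # ...and the x-axis
  scale_x_discrete(labels = tree_labels)

p_labels
```

Because the labels are matched by name, they stay attached to the right category even if the order of the categories changes.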

*REFRESHER* What if we want to change the position of the legend or the colour of the text?? Remember back to when we talked about themes? If we want to make changes to anything that's not related to the data (i.e., it could be a plot of anything or one without any data in it), we use THEMES.

```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_boxplot(aes(fill=type), colour="black")+
  theme_classic()+
  scale_fill_manual(name="Tree Type", values=xmas, labels=c("Balsam Fir", "Jack Pine", "Blue Spruce"))+
  scale_x_discrete(labels=c("Balsam Fir", "Jack Pine", "Blue Spruce"))+
  theme(legend.title = element_text(colour = "goldenrod3", size =14, face="bold"), legend.position="top")
```

DAY 18

On the eighteenth day of ChRistmas...

... we learned about guides!

Yesterday, we made some edits to our legend using scales and themes. Today, we will introduce one more way to exercise fine control over your graphics: guides() and the guide_*() functions! Guides, like our legends and axes, help us or our audience interpret our plots. We can use guides() or the "guide =" argument of the scale_*() functions to make additional changes to our legends and axes. Here's a great explanation of scales and guides from Ch. 15 of Wickham, Navarro and Pedersen's book, which I highly recommend you check out: https://ggplot2-book.org/index.html

"Formally, each scale is a function from a region in data space (the domain of the scale) to a region in aesthetic space (the range of the scale). The axis or legend is the inverse function, known as the guide: it allows you to convert visual properties back to data. You might find it surprising that axes and legends are the same type of thing, but while they look very different they have the same purpose: to allow you to read observations from the plot and map them back to their original values."

Let's look at some examples. Let's try a new graphic with our data, and we'll use a gradient colour scale to colour our points based on the amount of "Christmas magic" in our trees:
```{r}
trees %>%
ggplot(aes(x=needle.drop, y=height))+
  geom_point(aes(colour=xmas.magic), size=2)+
  theme_classic()+
  scale_colour_continuous(low="red", high="mediumseagreen")
```

Before we get back to guides, let's quickly chat about the gradient scale. There are many, many ways you can edit the colours, but in this case we told ggplot that we wanted to colour our points along a gradient with "scale_colour_continuous()" and then we set the low and high colours. We could have also set the middle colour or chosen an existing gradient. Learn more here: https://ggplot2-book.org/scale-colour.html
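To make the "middle colour" and "existing gradient" options concrete, here is a small sketch. It assumes a made-up data frame called toy_trees (not the real trees data). scale_colour_gradient2() takes a low, mid and high colour plus a midpoint, and scale_colour_viridis_c() is one of the ready-made continuous palettes that ship with ggplot2. The choice of "gold" and of the median as the midpoint is just mine.

```{r}
library(ggplot2)

# Hypothetical stand-in for the trees data (values are made up)
toy_trees <- data.frame(
  needle.drop = c(5, 12, 20, 8, 15, 25, 3, 10, 18),
  height      = c(2.1, 2.4, 2.2, 1.8, 1.9, 2.0, 2.8, 3.0, 2.9),
  xmas.magic  = c(10, 35, 60, 20, 50, 90, 5, 45, 75)
)

p_points <- ggplot(toy_trees, aes(x = needle.drop, y = height)) +
  geom_point(aes(colour = xmas.magic), size = 2) +
  theme_classic()

# Three-colour gradient: the middle colour sits at the median magic value
p_mid <- p_points +
  scale_colour_gradient2(low = "red", mid = "gold", high = "mediumseagreen",
                         midpoint = median(toy_trees$xmas.magic))

# An existing gradient: the continuous viridis palette
p_viridis <- p_points + scale_colour_viridis_c()

p_mid
p_viridis
```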

Back to guides! (colours are just so distracting!!)

We can make additional edits to our legends using "+ guides()" or by specifying the "guide = " argument within our scale layer (scale_colour_continuous(), which corresponds with our legend).
```{r}
trees %>%
ggplot(aes(x=needle.drop, y=height))+
  geom_point(aes(colour=xmas.magic), size=2)+
  theme_classic()+
  scale_colour_continuous(low="red", high="mediumseagreen")+
  guides(colour=guide_colourbar(reverse=TRUE, direction = "horizontal", barheight=unit(2, "cm")))
```
Here we've reversed the colour bar, laid it out horizontally, and made it taller. No changes have been made to the rest of our graphic. We can achieve the exact same output by adding "guide = " to our scale_colour_continuous() layer.
```{r}
trees %>%
ggplot(aes(x=needle.drop, y=height))+
  geom_point(aes(colour=xmas.magic), size=2)+
  theme_classic()+
  scale_colour_continuous(low="red", high="mediumseagreen", guide = guide_colourbar(reverse=TRUE, direction = "horizontal", barheight=unit(2, "cm")))
```
Here are a couple more ways we can use guides to edit our legend. Let's change up our plot a bit.
```{r}
trees %>%
ggplot(aes(x=xmas.magic, y=needle.drop))+
  geom_point(aes(colour=type), size=2.5)+
  theme_classic()+
  scale_colour_manual(values=xmas)
```
There's a bit of overlap in our points, so let's adjust the transparency using alpha:
```{r}
trees %>%
ggplot(aes(x=xmas.magic, y=needle.drop))+
  geom_point(aes(colour=type), size=2.5, alpha=0.7)+
  theme_classic()+
  scale_colour_manual(values=xmas)
```
But maybe we don't want our legend to also have transparent points, so we can use a guide to override this aesthetic change.

```{r}
trees %>%
ggplot(aes(x=xmas.magic, y=needle.drop))+
  geom_point(aes(colour=type), size=2.5, alpha=0.7)+
  theme_classic()+
  scale_colour_manual(values=xmas)+
  guides(colour=guide_legend(override.aes=list(alpha=1)))
```
Finally, let's use guides to change the aesthetics of our axes. We will go back to our sleighs dataset for this one.
```{r}
sleighs %>%
ggplot(aes(x=name, y=deerpower))+
  geom_point()+
  labs(x="Sleighs")+
  theme_classic()
```
As you can see, the names of the sleighs are impossible to read. Let's rotate the labels at the bottom so that they run vertically instead.
```{r}
sleighs %>%
ggplot(aes(x=name, y=deerpower))+
  geom_point()+
  theme_classic()+
  labs(x="Sleighs")+
  guides(x=guide_axis(angle=90))
```
Much better! Hopefully now you have an idea of some of the ways you can edit your guides (legends, axes) using guides() and guide =.
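Rotating is not the only way to untangle crowded axis labels. Here is a sketch of two alternatives, using a made-up toy_sleighs data frame (most of the sleigh names are invented for the example): guide_axis(n.dodge = 2) staggers the labels over two rows, and theme(axis.text.x = ...) rotates them through the theme instead of a guide.

```{r}
library(ggplot2)

# Hypothetical stand-in for the sleighs data (names and values are made up)
toy_sleighs <- data.frame(
  name = c("Winter Express", "Stealth Sleigh", "Candy Cane Cruiser",
           "Snowflake Special", "North Pole Rocket", "Jingle Jet"),
  deerpower = c(55, 260, 120, 90, 210, 150)
)

p_base <- ggplot(toy_sleighs, aes(x = name, y = deerpower)) +
  geom_point() +
  labs(x = "Sleighs") +
  theme_classic()

# Option 1: keep the labels horizontal but stagger them over two rows
p_dodge <- p_base + guides(x = guide_axis(n.dodge = 2))

# Option 2: rotate the labels through the theme instead of a guide
p_theme <- p_base +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

p_dodge
p_theme
```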

DAY 19

On the nineteenth day of Christmas...

... we made position adjustments!

Position adjustments are handy if you have overlapping geoms or data. You can override the default using the position argument in the geom_*() functions.

Instead of boxplots, let's look at the raw data points using a different type of geom layer, geom_jitter().
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_jitter(aes(colour=type))+
  theme_classic()+
  scale_colour_manual(values=xmas)

```
Position adjustments come in handy with point data like this, more so when we're working with large datasets that have many points. Let's adjust the position of our jittered points.
```{r}
trees %>%
ggplot(aes(x=type, y=height))+
  geom_jitter(aes(colour=type), position = position_jitter(width = 0.05, height = 0.5))+
  theme_classic()+
  scale_colour_manual(values=xmas)
```
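Two things are worth knowing about jitter: it is random (so the plot changes a little every time you run it), and any "height =" above 0 also moves the points up and down, away from their true y values. Here is a sketch that deals with both, assuming a made-up toy_trees data frame. The seed value of 25 is arbitrary.

```{r}
library(ggplot2)

# Hypothetical stand-in for the trees data (values are made up)
toy_trees <- data.frame(
  type   = rep(c("balsam", "jack", "spruce"), each = 4),
  height = c(2.1, 2.4, 2.2, 2.6, 1.8, 1.9, 2.0, 1.7, 2.8, 3.0, 2.9, 3.1)
)

# height = 0 leaves the y values untouched; seed makes the jitter repeatable
jit <- position_jitter(width = 0.1, height = 0, seed = 25)

# geom_point() with a jitter position behaves like geom_jitter()
p_jitter <- ggplot(toy_trees, aes(x = type, y = height)) +
  geom_point(aes(colour = type), position = jit) +
  theme_classic()

p_jitter
```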
Let's look at another example. We'll bring back our scatterplot of sleigh data but I'm going to cut it down a bit to make it a bit easier to work with. I'll do this using another tidyverse package, dplyr.

```{r}
# slice() keeps rows by position: here, rows 4 through 12 of sleighs
sleighs.subset <-slice(sleighs, 4:12)
```
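If slice() is new to you, here is a tiny sketch of what it does compared with filter(). The tibble toy_sleighs is made up for the example.

```{r}
library(dplyr)

# Hypothetical stand-in for the sleighs data: 15 made-up sleighs
toy_sleighs <- tibble(
  name      = paste("Sleigh", 1:15),
  deerpower = seq(40, 320, by = 20)
)

# slice() keeps rows by position: rows 4 to 12 give 9 rows
toy_subset <- slice(toy_sleighs, 4:12)

# filter() keeps rows by condition instead
toy_strong <- filter(toy_sleighs, deerpower >= 200)

toy_subset
toy_strong
```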

```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3)+
  theme_classic()
```
That legend is fine, but let's get rid of it and instead label each point. Do you remember how to remove the legend?
```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name))+
  theme_classic()
```
Well this might work better, but the labels are all overlapping and difficult to read. This is where position_nudge() comes in handy!

```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name), position = position_nudge(x=10, y=0.4))+
  theme_classic()
```
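geom_text() also has "nudge_x =" and "nudge_y =" arguments, which are a shorthand for the same position_nudge() adjustment. Here is a sketch with a made-up toy_sleighs data frame (names and values invented). I've also added "check_overlap = TRUE", which skips drawing any label that would overlap one already drawn.

```{r}
library(ggplot2)

# Hypothetical stand-in for the sleighs subset (names and values are made up)
toy_sleighs <- data.frame(
  name          = c("Winter Express", "Stealth Sleigh", "Jingle Jet", "Snowflake Special"),
  deerpower     = c(55, 260, 150, 90),
  km_per_carrot = c(21, 12, 17, 19)
)

p_nudge <- ggplot(toy_sleighs, aes(x = deerpower, y = km_per_carrot)) +
  geom_point(aes(fill = name), shape = 21, size = 3, show.legend = FALSE) +
  # Shift every label 10 units right and 0.4 units up from its point
  geom_text(aes(label = name), nudge_x = 10, nudge_y = 0.4, check_overlap = TRUE) +
  theme_classic()

p_nudge
```

The nudge amounts are in data units, so a value that works for one plot may be far too big or too small for another.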
And because our Stealth Sleigh is off the plot, let's fix that using what we learned on Day 13 about limits.
```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name), position = position_nudge(x=10, y=0.4))+
  theme_classic()+
  scale_x_continuous(limits=c(0,300))
```
DAY 20

On the twentieth day of Christmas...

... we did some position adjustments with bar plots!

First, let's create a barplot since we haven't done that yet. We'll base it on our previous sleighs subset.
```{r}
sleighs.subset %>%
  ggplot(aes(x=bells, y=reins))+
  geom_col()+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()
```
The sleighs come with 4, 6, or 8 bells. Because we used geom_col() with y=reins, the height of each bar is the total number of reins across all sleighs in that category (# of bells), not a count of sleighs. But maybe we want to see some additional information in our barplot, such as how those totals break down by the number of reins on each sleigh. I've heard that Santa takes these things super seriously, so this is a completely practical and reasonable plot. Note that we have to tell ggplot to treat reins as a categorical variable, not a continuous one, using as.character(). In this case, we can't have 3.5 reins.
```{r}
sleighs.subset %>%
  ggplot(aes(x=bells, y=reins, fill=as.character(reins)))+
  geom_col()+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()+
  scale_fill_manual(values=xmas)
```
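A quick aside on what the bar heights mean. geom_bar() counts the rows in each category, while geom_col() draws the y values you give it and stacks them, so each bar ends up as a sum. Here is a sketch with a made-up toy_sleighs data frame, with the same numbers worked out in a dplyr summary so you can compare.

```{r}
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the sleighs subset (values are made up)
toy_sleighs <- data.frame(
  bells = c(4, 4, 6, 6, 6, 8, 8, 8, 8),
  reins = c(2, 4, 2, 2, 4, 4, 4, 2, 4)
)

# geom_bar(): bar height = number of sleighs in each bells category
p_count <- ggplot(toy_sleighs, aes(x = factor(bells))) +
  geom_bar() +
  labs(x = "bells", y = "number of sleighs") +
  theme_classic()

# geom_col(): bar height = sum of reins in each bells category
p_sum <- ggplot(toy_sleighs, aes(x = factor(bells), y = reins)) +
  geom_col() +
  labs(x = "bells", y = "total reins") +
  theme_classic()

# The same two numbers, computed by hand
toy_summary <- toy_sleighs %>%
  group_by(bells) %>%
  summarise(n_sleighs = n(), total_reins = sum(reins))

p_count
p_sum
toy_summary
```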
The default is a stacked barplot (position = "stack"), but there are other ways we could display this using position adjustments. This option rescales every bar to the same height and shows each group as a proportion (from 0 to 1) using position = "fill".
```{r}
sleighs.subset %>%
  ggplot(aes(x=bells, y=reins, fill=as.character(reins)))+
  geom_col(position="fill")+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()+
  scale_fill_manual(values=xmas)
```
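With position = "fill" the y-axis runs from 0 to 1. If you would rather read it as percentages, one option (sketched here with a made-up toy_sleighs data frame) is to relabel the axis with label_percent() from the scales package, which is installed alongside ggplot2.

```{r}
library(ggplot2)

# Hypothetical stand-in for the sleighs subset (values are made up)
toy_sleighs <- data.frame(
  bells = c(4, 4, 6, 6, 6, 8, 8, 8, 8),
  reins = c(2, 4, 2, 2, 4, 4, 4, 2, 4)
)

p_fill <- ggplot(toy_sleighs,
                 aes(x = factor(bells), y = reins, fill = factor(reins))) +
  geom_col(position = "fill") +
  # Relabel the 0-1 axis as 0%-100%; the bars themselves do not change
  scale_y_continuous(labels = scales::label_percent()) +
  labs(x = "bells", y = "share of reins", fill = "reins") +
  theme_classic()

p_fill
```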
We can also position them side by side using position = "dodge". Note that the red and green bars within a bells category are now two separate bars sitting side by side instead of being stacked.
```{r}
sleighs.subset %>%
  ggplot(aes(x=bells, y=reins, fill=as.character(reins)))+
  geom_col(position="dodge")+
  scale_x_discrete(limits=c(4, 6, 8))+
  theme_classic()+
  scale_fill_manual(values=xmas)
```
Want a review? Try changing the name of your legend.

DAY 21

On the twenty-first day of Christmas...

...we learned about faceting!

Faceting produces smaller graphs that can be displayed alongside one another. We use facet_wrap() and facet_grid() for this. 

Let's start with facet_wrap(). Remember our line graph that looks a bit like Christmas lights? Let's use that. Here it is, as a reminder:
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(aes(colour=type), linetype="dashed")+
  geom_point(aes(colour=type), size = 3, shape = 8)
```
Now, instead of having all three lines on one plot, let's create three smaller plots and display them together. 

```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_wrap(~type, ncol=3)
```
Let's do the same thing but using facet_grid(). The syntax is a little different, but we've produced the exact same set of plots. In our case, ".~type" puts the plots side by side.
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_grid(.~type)
```
If we want to stack our plots instead, we change up the coding within facet_grid(). In our case, "type~." stacks the plots.
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_grid(type~.)
```
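If the ".~type" and "type~." formulas feel hard to remember, facet_grid() also accepts "rows =" and "cols =" with vars(). Here is a sketch using a made-up toy_trees data frame. As far as I understand it, these give the same layouts as the two formulas above.

```{r}
library(ggplot2)

# Hypothetical stand-in for the trees data (values are made up)
toy_trees <- data.frame(
  type       = rep(c("balsam", "jack", "spruce"), each = 3),
  height     = c(2.1, 2.4, 2.6, 1.7, 1.9, 2.0, 2.8, 2.9, 3.1),
  xmas.magic = c(10, 35, 60, 20, 50, 90, 5, 45, 75)
)

p_base <- ggplot(toy_trees, aes(x = height, y = xmas.magic, group = type)) +
  geom_line(linetype = "dashed") +
  geom_point(size = 3, shape = 8)

# Side by side: one column per tree type, like facet_grid(. ~ type)
p_side <- p_base + facet_grid(cols = vars(type))

# Stacked: one row per tree type, like facet_grid(type ~ .)
p_stack <- p_base + facet_grid(rows = vars(type))

p_side
p_stack
```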
Our datasets aren't really set up for this type of grid, but let's look at plots of reins by bells to show you how you could set up facet_grid() with multiple plots. A plot area is produced with two levels for reins and three levels for bells.
```{r}
sleighs.subset%>%
  ggplot(aes(x=reins, y=bells))+
  facet_grid(reins~bells)
```
You can also use "scales =" to adjust the scales of all or each of the plots. Let's go back to our first set of plots from today:
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_wrap(~type, ncol=3)
```
To adjust the scales, we add "scales =" to facet_wrap().
```{r}
trees %>%
ggplot(aes(x=height, y=xmas.magic, group=type))+
  geom_line(linetype="dashed")+
  geom_point(size = 3, shape = 8)+
  facet_wrap(~type, ncol=3, scales = "free_y")
```
By choosing "free_y" we have changed the y scale from fixed to free but the x-axis remained the same. Take a look at how the y-axis scale is now different for each plot. Other options are "free_x", "fixed", or "free".

There are lots more things you can do with facets. Check out Ch. 17 of "ggplot2: Elegant Graphics for Data Analysis" for more: https://ggplot2-book.org/facet.html

Want a review? Try changing the style of the points and lines in these plots. Also, try removing the grey background!

DAY 22

On the twenty-second day of Christmas...

...we annotated our plots!

Sometimes we may want to add text to our plots, not just titles and labels, but annotations on the data or plot area. We can do this using geom_text(). We did this on Day 19 when we talked about position adjustments. Now we're going to discuss annotations in more detail. Let's bring back that graphic, but with some added text and labels.
```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name), position = position_nudge(x=10, y=0.4))+
  theme_classic()+
  scale_x_continuous(limits=c(0,300))+
  labs(x="Deerpower", y="Kilometers per carrot", title="Top options for Santa's sleigh upgrade", subtitle= "Source: The Ultimate Sleigh Catalogue 2022")
```
Let's say that the elf in charge wants to send this to Santa but wants to mark, on the plot, which sleigh is her top choice for Santa. We can do this using the annotate() function. We first need to set up our x & y ranges and our caption text.

```{r}
yrng <- range(sleighs.subset$km_per_carrot)
xrng <- range(sleighs.subset$deerpower)
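# Wrap the caption at 40 characters and join the pieces into a single string
caption <- paste(strwrap("Sprinkle's top choice: Winter Express", width = 40), collapse = "\n")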
```

Then we can make our plot:
```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name), position = position_nudge(x=10, y=0.4))+
  theme_classic()+
  scale_x_continuous(limits=c(0,300))+
  annotate(geom = "text", x = xrng[1], y = yrng[2], 
    label = caption, hjust = -0.4, vjust = 1, size = 5, fontface="bold")
```

Or maybe we want to annotate the point directly to show Santa which sleigh Sprinkle chose as her top recommendation.

```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name), position = position_nudge(x=10, y=0.4))+
  theme_classic()+
  scale_x_continuous(limits=c(0,300))+
  annotate("text", x=c(55), y=c(21.6), label=c("Top choice!"), size = 7, fontface="bold")
```
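annotate() is not limited to text. Here is a sketch that adds an arrow pointing from the label to the point, using annotate("segment") with the "arrow =" argument. The toy_sleighs data frame and all of the coordinates are made up for the example, so you would need to adjust them for the real plot.

```{r}
library(ggplot2)

# Hypothetical stand-in for the sleighs subset (names and values are made up)
toy_sleighs <- data.frame(
  name          = c("Winter Express", "Stealth Sleigh", "Jingle Jet", "Snowflake Special"),
  deerpower     = c(55, 260, 150, 90),
  km_per_carrot = c(21, 12, 17, 19)
)

p_arrow <- ggplot(toy_sleighs, aes(x = deerpower, y = km_per_carrot)) +
  geom_point(aes(fill = name), shape = 21, size = 3, show.legend = FALSE) +
  # The label sits to the right of the point, left-aligned at x = 100
  annotate("text", x = 100, y = 24, label = "Top choice!",
           fontface = "bold", hjust = 0) +
  # The arrow runs from just left of the label to just above the point
  annotate("segment", x = 98, y = 24, xend = 58, yend = 21.8,
           arrow = arrow(length = unit(0.3, "cm"))) +
  theme_classic()

p_arrow
```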

Hopefully it's clear which point we're referring to here ("Winter Express") but in case it's not, we can also highlight it by adding another geom_point() layer. Note that the extra geom_point() layer must go before our original geom_point() layer, or the orange dot will be drawn on top of the coloured point (which in this case would also be fine!).
```{r}
sleighs.subset %>%
  ggplot(aes(x=deerpower, y=km_per_carrot))+
  geom_point(data=filter(sleighs.subset, name =="Winter Express"), colour="orange", size=6)+
  geom_point(aes(fill=name), shape=21, size=3, show.legend = F)+
  geom_text(aes(label=name), position = position_nudge(x=10, y=0.4))+
  theme_classic()+
  scale_x_continuous(limits=c(0,300))+
  annotate("text", x=c(55), y=c(21.6), label=c("Top choice!"), size = 7, fontface="bold")
```
There's much more you can do with annotations. As usual, I'll direct you to ggplot2: Elegant Graphics for Data Analysis: https://ggplot2-book.org/annotations.html#direct-labelling

Want a review? Try changing the colour palette of the points in this plot.

DAY 23

On the twenty-third day of Christmas...

...we introduced cowplot!!

"Cowplot????????"

Yes. Cowplot.

Cowplot is an add-on to ggplot and allows us to combine several plots into one.

"How is that different from faceting??"

Cowplot allows you to combine plots of different types into one image!

First, let's install and load the cowplot package.
```{r}
install.packages("cowplot")
library(cowplot)
```

Now let's make a few graphs.
```{r}
sleigh.plot1 <-sleighs.subset %>%
  ggplot(aes(x=km_per_carrot, y=bag_space))+
  geom_point(colour="darkgreen")+
  theme_classic()

sleigh.plot2 <-sleighs.subset %>%
  ggplot(aes(x=bells))+
  geom_bar(fill="firebrick2")+
  theme_classic()

sleigh.plot3 <-sleighs.subset %>%
  ggplot(aes(x=km_per_carrot, y=deerpower))+
  geom_quantile(colour="gold3")+
  theme_classic()

tree.plot1<-trees %>%
  ggplot(aes(x=type, y=height))+
  geom_boxplot(fill="seagreen")+
  theme_classic()

tree.plot2 <-trees %>%
  ggplot(aes(x=xmas.magic))+
  geom_dotplot(fill="tomato3")+
  theme_classic()
```
Now we can use cowplot to create one image with multiple plots.
```{r}
plot_grid(tree.plot1, tree.plot2, nrow=1, labels = c("A", "B"))
```
Let's combine all 5 plots into one image.
```{r}
plot_grid(tree.plot1, tree.plot2, sleigh.plot1, sleigh.plot2, sleigh.plot3, nrow = 2)
```
This looks okay, but maybe we wanted the tree plots to be on the top and the sleigh plots to be on the bottom. We can do that by specifying which ones go in the top row and which ones go in the bottom row.

```{r}
# One plot_grid() per row; labels continue from A-B to C-E so no letter repeats
first_row <-plot_grid(tree.plot1, tree.plot2, nrow=1, labels = c("A", "B"))
second_row <-plot_grid(sleigh.plot1, sleigh.plot2, sleigh.plot3, nrow=1, labels = c("C", "D", "E"))

plot_grid(first_row, second_row, ncol=1, nrow=2)
```
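Two more things you might want once your plots are combined: giving one panel more room, and saving the result. Here is a sketch with a made-up toy data frame and two simple plots. plot_grid() has a "rel_widths =" argument for the first job, and ggsave() handles the second. The file name and the figure size are just my choices, and I save into a temporary folder so nothing lands in your project.

```{r}
library(ggplot2)
library(cowplot)

# Hypothetical toy data (values are made up)
toy <- data.frame(
  x = 1:10,
  y = c(2, 4, 3, 6, 5, 8, 7, 9, 12, 10),
  g = rep(c("a", "b"), 5)
)

plot_a <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(colour = "darkgreen") +
  theme_classic()

plot_b <- ggplot(toy, aes(x = g)) +
  geom_bar(fill = "firebrick2") +
  theme_classic()

# rel_widths: the first panel gets twice the width of the second
combined <- plot_grid(plot_a, plot_b, nrow = 1,
                      labels = c("A", "B"), rel_widths = c(2, 1))

# Save the combined image as a PDF (width and height are in inches)
out_file <- file.path(tempdir(), "combined_plots.pdf")
ggsave(out_file, combined, width = 8, height = 4)

combined
```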
Want a review? Try changing your axis labels to something cleaner and more informative (pretend you were going to publish this image in a paper!). Hint: you will need to edit the labels in your original plots, not in the plot_grid(). If you can't remember how, check out day 10.

DAY 24

On the twenty-fourth day of Christmas...

...we put our knowledge to the test!

Today we will not learn anything new. Instead, we will bring together what we've learned to build some plots!

We will build three different plots:

(1) Build a plot using the trees dataset that shows violin plots by tree type, also coloured by tree type. Add a title, informative axis labels, and a legend at the bottom.

(2) Build a plot using the sleighs dataset that shows deerpower by weight, with a trendline, points coloured by km per carrot (with a size and shape where you can see the colour differences), and an informative title, axis labels, and legend.

(3) Build an image that shows the previous two graphs side-by-side in one image with labels A and B.

And as a little Christmas Eve gift, here is a fantastic cheatsheet for ggplot2. I keep it in my bookmarks bar :)
https://github.com/rstudio/cheatsheets/blob/main/data-visualization-2.1.pdf - this might come in handy while you are working through these exercises!

DAY 25 - MERRY CHRISTMAS!!!

I hope you enjoyed this advent calendar. Similarly to the previous one, for day 25, I've given you code for a Christmas visual created by someone else (in this case, data scientist, Jodie Burchell). But unlike in the original R advent calendaR, now you should be able to understand a lot of the components and the grammar used in this code. Load the data and run the code to see what happens! The original blog post can be found here: https://t-redactyl.io/blog/2016/12/a-very-ggplot2-christmas.html

First, load in this dataset, which is available through GitHub.
```{r}
ChristmasTree <- read.csv("https://raw.githubusercontent.com/t-redactyl/Blog-posts/master/Christmas%20tree%20base%20data.csv")
```

```{r}
# Generate the "lights"
Desired.Lights <- 50
Total.Lights <- sum(round(Desired.Lights * 0.35) + round(Desired.Lights * 0.20) + 
                      round(Desired.Lights * 0.17) + round(Desired.Lights * 0.13) +
                      round(Desired.Lights * 0.10) + round(Desired.Lights * 0.05))

Lights <- data.frame(Lights.X = c(round(runif(round(Desired.Lights * 0.35), 4, 18), 0),
                                       round(runif(round(Desired.Lights * 0.20), 5, 17), 0),
                                       round(runif(round(Desired.Lights * 0.17), 6, 16), 0),
                                       round(runif(round(Desired.Lights * 0.13), 7, 15), 0),
                                       round(runif(round(Desired.Lights * 0.10), 8, 14), 0),
                                       round(runif(round(Desired.Lights * 0.05), 10, 12), 0)))
Lights$Lights.Y <- c(round(runif(round(Desired.Lights * 0.35), 4, 6), 0),
                          round(runif(round(Desired.Lights * 0.20), 7, 8), 0),
                          round(runif(round(Desired.Lights * 0.17), 9, 10), 0),
                          round(runif(round(Desired.Lights * 0.13), 11, 12), 0),
                          round(runif(round(Desired.Lights * 0.10), 13, 14), 0),
                          round(runif(round(Desired.Lights * 0.05), 15, 17), 0))
Lights$Lights.Colour <- c(round(runif(Total.Lights, 1, 4), 0))

# Generate the "baubles"
Baubles <- data.frame(Bauble.X = c(6, 9, 15, 17, 5, 13, 16, 7, 10, 14, 7, 9, 11, 
                                   14, 8, 14, 9, 12, 11, 12, 14, 11, 17, 10))
Baubles$Bauble.Y <- c(4, 5, 4, 4, 5, 5, 5, 6, 6, 6, 8, 8, 8, 8, 10,
                      10, 11, 11, 12, 13, 10, 16, 7, 14)
Baubles$Bauble.Colour <- factor(c(1, 2, 2, 3, 2, 3, 1, 3, 1, 1, 1, 2, 1, 2,
                                  3, 3, 2, 1, 3, 2, 1, 3, 3, 1))
Baubles$Bauble.Size <- c(1, 3, 1, 1, 2, 1, 2, 2, 2, 1, 1, 1, 3, 3, 3,
                         2, 3, 1, 1, 2, 2, 3, 3, 2)

# Generate the plot
ggplot() + 
  geom_tile(data = ChristmasTree, aes(x = Tree.X, y = Tree.Y, fill = Tree.Colour)) +
  scale_fill_identity() + 
  geom_point(data = Lights, aes(x = Lights.X, y = Lights.Y, alpha = Lights.Colour),
             colour = "lightgoldenrodyellow", shape = 16) +
  geom_point(data = Baubles, aes(x = Bauble.X, y = Bauble.Y, colour = Bauble.Colour, size = Bauble.Size),
             shape = 16) +
  scale_colour_manual(values = c("firebrick2", "gold", "dodgerblue3")) +
  scale_size_area(max_size = 12) +
  theme_bw() +
  scale_x_continuous(breaks = NULL) + 
  scale_y_continuous(breaks = NULL) +
  geom_segment(aes(x = 2.5, xend = 4.5, y = 1.5, yend = 1.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 5.5, xend = 8.5, y = 1.5, yend = 1.5), colour = "dodgerblue3", size = 2) +
  geom_segment(aes(x = 13.5, xend = 16.5, y = 1.5, yend = 1.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 17.5, xend = 19.5, y = 1.5, yend = 1.5), colour = "dodgerblue3", size = 2) +
  geom_segment(aes(x = 3.5, xend = 3.5, y = 0.5, yend = 2.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 7.0, xend = 7.0, y = 0.5, yend = 2.5), colour = "dodgerblue3", size = 2) +
  geom_segment(aes(x = 15.0, xend = 15.0, y = 0.5, yend = 2.5), colour = "blueviolet", size = 2) +
  geom_segment(aes(x = 18.5, xend = 18.5, y = 0.5, yend = 2.5), colour = "dodgerblue3", size = 2) +
  annotate("text", x = 11, y = 20, label = "Merry Christmas!",
           size = 12) +
  labs(x = "", y = "") +
  theme(legend.position = "none")
```
